Structured Approaches for Problems of Network Design and Utilization


Michael  R.  Fellows,  University  of  Idaho 
(Contract  N00014-88-K-0456) 

Michael  A.  Langston,  Washington  State  University 
(Contract  N00014-88-K-0343) 


Abstract. 

The  effectiveness  of  two  general,  structured  approaches  to  broad  classes  of  network  design 

and  utilization  problems  is  investigated.  These  are: 

(1)  an  approach  to  network  algorithmic  problems  based  on  well-partial  orders  (wpo’s)  on 
sets  of  combinatorial  objects,  where  the  goal  is  to  develop  this  powerful  mathematical 
perspective  into  a  foundation  for  practical  algorithms  and 

(2) an approach to symmetric and fault-tolerant interconnection network design and
allocation problems employing algebra and coding theory, where the goal is to establish
effective  design  paradigms  drawing  on  established  mathematical  resources. 

The  results  obtained  in  these  research  areas  during  this  period  of  ONR  support  and  described 

in  this  progress  report  include: 

(1a) general methods for overcoming the major difficulties in obtaining practical algorithms
for network problems from wpo-based tools,

(1b) improved practical algorithms for some well-known algorithmic problems of networks,

(2a) a host of record-breaking algebraic constructions in the range of engineering significance
for the much-studied degree/diameter network construction problem,

(2b) basic results for the planar and planar-symmetric versions of the degree/diameter
network construction problem, and

(2c)  useful  schemes  for  locally  complete  data  distribution  in  networks,  and  general  methods 
for  employing  algebraic  network  descriptions  to  solve  this  problem. 
Background and Objectives.

Networks of many kinds play an increasing role in almost every aspect of modern science
and technology, and figure centrally in the forefront of developments in computer science.
Problems concerning the design, organization and utilization of networks play a correspondingly
important role. For these problems, it is desirable to have useful general tools and
methodologies that are organizing principles, that is, approaches that can be applied to
broad classes of particular problems.

Our research has been centered on the development of two such broad perspectives on network
design and algorithmic problems, both of which are based on strong mathematical
resources. In the first, we seek to develop the theoretical basis of wpo-based tools so that
they might provide a foundation for practical network algorithms. In the second, we endeavor
to demonstrate the effectiveness of algebraic methods for problems of network design.
Our research program recognizes and addresses these aspects:

•  the  emergence  of  the  importance  of  network  problems, 

• the need to develop more powerful and well-integrated theoretical perspectives on network
problems, and

•  the  opportunity  provided  by  the  recent  fundamental  mathematical  breakthroughs  of 
Robertson  and  Seymour,  and  others. 

Research  Issues,  Approaches  and  Progress. 

Our research objectives have been formulated at the level of addressing issues concerning
fundamental  feasibility  of  these  approaches,  rather  than  that  of  the  creation  of  software 
or  systems.  At  this  level  we  have  already  achieved  many  of  the  objectives  in  our  original 
proposal.  We  therefore  take  the  opportunity  presented  by  this  progress  report  to  articulate 
an  updated  set  of  research  objectives  for  this  general  line  of  investigation. 

wpo-based methods

Our earlier research has demonstrated the wide applicability of wpo-based tools to many
well-known, previously challenging, interesting and useful algorithmic problems. Such applications,
however, left unresolved a seemingly enormous gap between these theoretical results
and  any  potential  for  practical  algorithms.  The  major,  virtually  unprecedented  issues  to  be 
faced  include: 

a. the polynomial-time algorithms guaranteed by the wpo-related theorems are nonconstructively
proven only to exist,

b.  the  algorithms  promised  involve  behemoth  constants  (towers  of  2’s  of  height  described 
by  towers  of  2’s  of  height  ...),  and 

c.  the  algorithms  promised  solve  only  decision  versions  of  the  problems  to  which  the 
theorems  apply. 
Confronting these difficulties would seem to constitute an adequate agenda, even for a primarily
theoretical investigation! Indeed, so much so that at least one prominent member of
the research community predicted that wpo-based techniques would never amount to more
than a “mathematical curiosity,” at best a signpost for polynomial time.

Yet,  by  the  results  we  have  recently  presented  at  STOC  89,  these  apparent  roadblocks 
have,  for  the  most  part,  been  decisively  removed  for  most  of  the  known  applications.  Other 
researchers (for example, Bodlaender) have already begun to augment and extend our techniques
for addressing issues b and c. We have recently obtained even stronger methods for
addressing  issue  a. 

A  remaining  major  issue  (the  size  of  obstruction  sets)  is  an  important  new  focus  in  this 
project. Along with Nancy Kinnersley, a 1989 Ph.D. graduate from Washington State University
(whose graduate research was supported by this contract, and who has just accepted
a faculty position at the University of Kansas), we have recently completed identification of
the 110 obstructions to pathwidth 2. On the basis of this work we have gathered strong evidence
to suggest that not all obstructions are created equal, and that for many applications
the difficulty can be overcome by employing a reasonably small approximate obstruction set
that embodies almost all of the structural information about the problem.

Our  current  objectives  include: 

• a structure theory for obstruction sets that allows us to understand the possibilities
for  approximate  obstruction  sets, 

• an understanding of the possibilities for randomized multiple-trial self-reduction algorithms,
and how these might interact with approximate obstruction sets,

• the exploration of possible self-reduction algorithms that are designed to work correctly
almost always employing only a small set of obstructions, and

•  continued  research  on  faster  order  tests  for  the  important  wpo  sets. 

algebraic methods for network design

The chief advantage of an algebraic approach is that for some applications of large and
complex networks that are presently contemplated (for example, in parallel processing) it is
natural to use symmetry as an organizing principle. (Thus, for example, in a vertex symmetric
network one might just write one message-routing protocol for a node, and translate it
into one for all the other nodes.) By symmetry one inevitably means group theory, for which
we have well-established mathematical knowledge on which to draw. Algebraic network
descriptions have other organizational advantages, including being compact and comprehensible;
this can support efficient routing computations, and can be an aid to exploiting the
symmetries of a computational problem.
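
The remark about translating a single routing protocol can be illustrated with a minimal sketch. It assumes the hypercube, viewed as the Cayley graph of the group Z_2^n with the unit vectors as generators; the function names are hypothetical and are not taken from the project.

```python
# Sketch (assumption): nodes of the n-cube are integers, generator i flips bit i.

def route_from_identity(target):
    """The one protocol we write: generators leading from node 0 to `target`."""
    return [i for i in range(target.bit_length()) if (target >> i) & 1]

def route(source, dest):
    """Translate the identity-based protocol to an arbitrary source node."""
    # In Z_2^n the group element carrying source to dest is source XOR dest.
    return route_from_identity(source ^ dest)

def follow(source, generators):
    """Apply a generator sequence and return the list of nodes visited."""
    path = [source]
    for g in generators:
        path.append(path[-1] ^ (1 << g))
    return path
```

For instance, `follow(5, route(5, 6))` visits nodes 5, 4, 6. The same translation idea applies to any Cayley graph, with XOR replaced by the group operation.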

When we began this project two years ago we knew but a handful of largest known constructions
of networks of a given degree and diameter by algebraic means (especially Cayley
graphs). This has been a much-studied problem and at the time of our proposal almost
all of the largest known constructions for the range of the parameters (degree, diameter) of
potential engineering significance had been obtained by various authors, mostly by means of
an assortment of graph-theoretic compositional techniques.

Since  then  we  have  rewritten  the  table  almost  completely,  and  decisively  demonstrated  the 
power  of  an  algebraic  approach  to  this  well-known  network  design  problem.  This  was  the 
principal initial objective for this topic in our original proposal. Along with C. S. Jagadish,
a  graduate  student  at  the  University  of  Idaho  supported  under  this  award,  we  have  made 
basic  advances  on  the  planar  and  planar  vertex  symmetric  variations  on  this  problem. 
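
To make the degree/diameter question concrete, the following sketch measures the order, degree and diameter of a Cayley graph. It is only an illustration under a simplifying assumption: the group is the cyclic group Z_n, far simpler than the groups behind record constructions, and the function name is hypothetical. Because a Cayley graph is vertex symmetric, a single breadth-first search from the identity yields the diameter.

```python
from collections import deque

def cayley_diameter(n, generators):
    """Return (order, degree, diameter) of the Cayley graph of Z_n (assumed group)."""
    # Close the generator set under inverses so the graph is undirected.
    gens = {g % n for g in generators} | {(-g) % n for g in generators}
    gens.discard(0)
    dist = {0: 0}                      # distances from the identity element
    queue = deque([0])
    while queue:
        x = queue.popleft()
        for g in gens:
            y = (x + g) % n
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    if len(dist) < n:
        raise ValueError("generators do not generate the whole group")
    # Vertex symmetry: eccentricity of the identity equals the diameter.
    return n, len(gens), max(dist.values())
```

For example, `cayley_diameter(13, [1, 5])` describes a network with 13 nodes, degree 4 and diameter 2.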

Another  problem  that  our  research  has  addressed  is  that  of  devising  efficient  schemes  for 
storing  partial  copies  of  a  database  at  the  nodes  of  a  network  so  that  each  node  has  a  complete 
copy  of  the  entire  database  in  its  immediate  neighborhood.  Algebraically  described  networks 
support  efficient  algebraic  solutions  to  this  problem;  we  have  obtained  asymptotically  optimal 
solutions  for  the  hypercubes,  in  the  process  settling  in  the  affirmative  a  conjecture  in  coding 
theory  due  to  Cohen. 
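
As an illustration of a locally complete distribution on a hypercube, the sketch below uses the classical Hamming-code syndrome. This is an assumption chosen for simplicity, not the asymptotically optimal scheme reported above: it applies when the dimension is n = 2^k - 1, it splits the database into n + 1 blocks, and it takes "immediate neighborhood" to mean a node together with its n neighbors. The function names are hypothetical.

```python
def block_held_by(node, n):
    """Index of the database block stored at `node` of the n-cube (syndrome)."""
    block = 0
    for i in range(n):
        if (node >> i) & 1:
            block ^= i + 1             # coordinates are labelled 1..n
    return block

def is_locally_complete(n):
    """Check that every closed neighborhood holds all n + 1 blocks."""
    wanted = set(range(n + 1))
    for node in range(1 << n):
        seen = {block_held_by(node, n)}
        seen.update(block_held_by(node ^ (1 << i), n) for i in range(n))
        if seen != wanted:
            return False
    return True
```

Under these assumptions each node stores only a 1/(n + 1) fraction of the database; the check succeeds for n = 3 and n = 7 and fails for n = 4, where no perfect code of this kind exists.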

Our  current  objectives  include: 

•  to  improve  the  largest  known  degree/diameter  constructions  for  small  diameter  values 
by  employing  Cayley  coset  graphs, 

• to explore an algebraic approach to the degree/broadcast diameter construction problem,

•  to  explore  ways  of  using  the  algebraic  descriptions  for  classes  of  good  constructions  to 
efficiently  solve  routing  and  deadlock  problems,  and 

• to continue our exploration of locally complete data distribution schemes in algebraically
described networks, seeking especially improved schemes for small dimension
hypercubes and other network families important in parallel processing.

Research  Directions. 
wpo-based  methods 

At  this  point,  this  exciting  research  frontier  appears  to  be  significantly  undermanned  relative 
to  its  potential,  perhaps  in  part  due  to  the  initial  bad  publicity  surrounding  the  difficulties 
a  through  c  described  above,  and  the  daunting  nature  of  the  founding  mathematical  results 
(the  completed  proof  of  the  main  Robertson-Seymour  theorems  concerning  the  minor  order 
of graphs comprises approximately 1600 journal pages). It does seem remarkable that the
many  interesting  consequences  of  these  deep  theorems,  and  the  basic  questions  for  computer 
science  that  they  raise,  were  essentially  absent  from  the  major  conferences  in  theoretical 
computer  science  for  a  period  of  almost  5  years  after  their  announcement.  It  is  quite  difficult 
to  imagine  a  similar  intellectual  sociology  occurring  in,  say,  physics  or  chemistry. 

There are a great many leads worth pursuing. The more theoretical aspects of the theory
and its applications to computer science are beginning to gain researchers and momentum,
perhaps for the most part recruited from mathematics. As we have reported, however, most
of the major issues for a host of potentially interesting applications have been overcome.
The time is thus ripe for significant experimental exploration with implemented algorithms,
but there is as of yet almost no such research activity, suggesting that:

• the methods for computing and mechanically verifying obstruction sets that we have
recently developed should be tried out experimentally for some small applications,

•  a  library  of  implementations  of  the  best  known  minor  tests  should  be  assembled,  and 

• the “learning algorithm” constructivization should be explored experimentally for some
small  examples. 

Note that an attractive feature of wpo-based tools is their modular nature. For a given
application, the relevant order tests can be performed trivially in parallel, and a single order
test  might  be  useful  in  many  different  applications,  and  can  thus  reside  in  a  universal  library. 
Perhaps  this  library  itself  can  be  to  some  extent  mechanically  generated. 
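
The modular structure can be sketched as a small library of order tests keyed by obstruction. The example is a toy under stated assumptions: graphs are simple (no loops or repeated edges), the family is the disjoint unions of paths, whose minor-order obstructions are K3 and K_{1,3}, and each test exploits a shortcut specific to its tiny obstruction rather than a general minor test. The names `ORDER_TESTS` and `in_family` are hypothetical.

```python
def has_cycle(n, edges):
    """Order test for K3: a simple graph has a K3 minor iff it has a cycle."""
    parent = list(range(n))            # union-find over vertices 0..n-1
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return True
        parent[ru] = rv
    return False

def has_claw(n, edges):
    """Order test for K_{1,3}: a minor iff some vertex has degree >= 3."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return any(d >= 3 for d in degree)

# Hypothetical universal library: one reusable test per obstruction.
ORDER_TESTS = {"K3": has_cycle, "K1,3": has_claw}

def in_family(n, edges, obstruction_names):
    """A graph is in the family iff none of the listed obstructions is found."""
    return not any(ORDER_TESTS[name](n, edges) for name in obstruction_names)
```

The tests share no state, so they could be dispatched to separate processors; the K3 test alone already decides membership in the family of forests, which shows how one library entry serves several applications.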

algebraic  methods  for  network  design 

The major direction that awaits exploration based on our results is whether the algebraic
constructions that we have identified can be fully developed into, for example, superior
alternatives to the hypercubes for parallel processing networks. This necessarily involves
consideration of many more aspects than merely degree/diameter properties. We have recently
begun to explore a number of these aspects in collaboration with Vance Faber of the
Los  Alamos  National  Laboratory.  The  intuition  behind  our  work  is  that  much  of  what  makes 
the  hypercube  (and  most  other  proposed  network  topologies)  attractive  is,  essentially,  the 
fact  that  it  has  an  easily  manipulable  algebraic  description. 

A  Grand  Challenge. 

Wpo-based tools should not be perceived as exotic or special, but rather as a kind of “generalized”
brute force. Interesting families of combinatorial objects are often closed under
local  operations  by  which  a  partial  ordering  of  the  objects  can  be  defined.  The  basic  rule  of 
thumb  is  that  in  some  sufficiently  restricted  setting  this  partial  order  is  a  wpo  that  supports 
wpo-based  complexity  tools. 

For  some  sets  of  operations  (e.g.,  those  defining  the  minor  and  immersion  orders)  the  setting 
is  simply  all  the  objects,  while  for  others  the  best  setting  available  is  more  restrictive.  For 
example,  the  operations  (1)  remove  a  subdivision  and  (2)  take  a  subgraph  (which  define  the 
topological order) do not yield a well-partial order in general, but this does constitute a
wpo in the setting: all graphs that do not contain k disjoint cycles. (We have recently shown
that  for  this  wpo  all  order  tests  can  be  done  in  linear  time.) 

The Hilbert-sized problem is to provide a comprehensive explanation of what sets of operations
on what sets of objects yield wpo-based complexity tools.

Research Transitions.

At this time, there is an enormous cultural and intellectual gap in many problem domains
between researchers who explore algorithmic problems of graphs and networks theoretically,
and engineering practitioners who are forced to “hack away” at problems that must be dealt
with  immediately,  somehow.  What  realistically  can  be  done  about  this  situation  is  not 
completely  clear. 

Technological  Impacts. 

Our  research  is  primarily  theoretical  in  nature  and  thus  is  not  significantly  impacted  by  or 
awaiting  new  developments  in  hardware  and  software  tools.  One  rather  minor  exception 
to  this  statement  concerns  our  recent  computational  exploration  of  some  Cayley  networks 
having  in  the  range  of  0.5  to  10  million  nodes.  A  problem  we  encountered  concerned  finding 
a  machine  with  sufficient  fast  memory. 

Societal  Issues. 

We  seem  to  have  entered  an  era  in  which  the  great  majority  of  the  most  able  and  motivated 
graduate  students  are  foreign  students.  We  believe  this  is  a  partial  reflection  of  the  fact 
that  graduate  study  is  so  very  poorly  supported,  despite  the  often  major  contributions  to 
the  research  frontier,  even  in  the  short  run,  that  are  due  to  energetic  and  talented  students. 
A  truly  vigorous  program  for  research-oriented  graduate  study  would  be  a  highly  leveraged 
investment  in  this  nation’s  security  and  quality  of  life. 

Recommendations  to  Funding  Agencies. 

Science funding sometimes seems to us to be rather akin to the child’s toy “Chinese handcuffs,”
in that the more that funding agencies strain for short-term payoffs and objectives,
the  less  progress  will  be  made  in  the  long  haul.  We  believe  that  it  should  be  the  objective  of 
science  funding  agencies  to  provide  broad-based  support  of  science  and  facilitate  the  transfer 
of  basic  science  towards  classrooms,  industry,  defense  and  other  applications.  It  also  seems 
to  us  that  basic  science  is  most  efficiently  supported  by  grants  to  individual  researchers  or 
small  groups,  with  funding  decisions  determined  heavily  by  scientific  peer  review,  or  some 
process  that  is  highly  respectful  of  a  science’s  critical  self-evaluation. 


ALGORITHMS  BASED  ON  GRAPH  DECOMPOSITION 

MATTHIAS  F.M.  STALLMANN* 


Abstract. This progress report describes research on dynamic programming algorithms
and related issues. Applications include problems in parameter estimation in
PERT networks, network reliability, algorithms for NP-hard graph problems, hypercube
embedding, and graph coloring problems that arise in code optimization and via
minimization  for  VLSI. 

1. Background. This research project has evolved into one whose central issue is
dynamic programming. It was not intended that way originally, nor did I approach it as
someone who was interested in becoming an expert on dynamic programming. I should
clarify at the outset that my use of the phrase “dynamic programming” is not a technical
term,  as  it  might  be  in  certain  circles  of  the  operations  research  community.  It  is  rather 
a  broad,  but  vaguely  defined,  framework  for  algorithm  design  in  which  solutions  to 
large  instances  of  a  problem  are  built  up  from  solutions  of  smaller  instances  of  the  same 
problem.  Most  of  the  standard  textbooks  on  algorithm  design  recognize  the  importance 
of  dynamic  programming  as  an  algorithm  design  technique  (see  e.g.  [2]).  Because  of 
its  utility  in  solving  practical  problems,  it  could  be  argued  that  dynamic  programming 
is the most important algorithm design technique. Dynamic programming algorithms
are  easy  to  formulate  and  easy  to  implement  and  are  often  very  efficient.  Robustness 
is  the  main  selling  point  of  dynamic  programming  algorithms  -  it  is  usually  easy  to 
incorporate  additional  constraints  or  to  adapt  from  a  cardinality  problem  to  a  weighted 
problem. 

I  came  around  to  this  view  only  recently,  having  been  schooled  in  all  the  fine  points 
of  efficient  data  structures,  graph  searching,  and  augmenting  path  algorithms  (all  of 
which  have  been  extremely  useful).  When  I  came  to  NCSU,  I  decided,  since  I  was  now 
a  theoretician  in  a  place  that  put  great  emphasis  on  practical  applications,  to  do  a 
lot  of  listening  to  colleagues  and  students,  and  to  glean  from  them  the  areas  in  which 
theory  might  prove  useful.  I  kept  a  particular  lookout  for  problems  that  were  suspected 
to be intractable (NP-hard or worse), but had no natural structure to suggest an NP-hardness
proof, hoping to find matroid parity problems lurking in them (since my thesis
was  about  matroid  parity). 

I haven’t stumbled across any matroids, but I have become better at proving NP-completeness
results and at devising dynamic programming algorithms for problems
that  are  not  NP-hard.  The  problem  which  is  the  central  focus  of  this  report,  directed 
acyclic  graph  reduction,  has  an  interesting  story  behind  it  that  illustrates  how  I  have 
chosen  some  of  the  problems  I’ve  worked  on,  or  rather  how  they  have  chosen  me. 

* Department of Computer Science, North Carolina State University, Raleigh, NC 27695-8206

Fig. 1. The interdictive graph (IG).

Some time in late 1986 or early 1987, Salah Elmaghraby asked me if there was
an efficient way to enumerate all the IG’s (subgraphs homeomorphic from the dag
pictured in Figure 1) of a dag. He was in the process of writing a survey paper on
parameter estimation in PERT networks, and getting rid of IG’s appeared to be a
central  issue.  I  remembered  a  result  about  subgraph  homeomorphism,  that  it  could 
be  done  in  polynomial  time  for  dags  [19],  and  pointed  out  that  this  could  be  used,  at 
the  very  least,  to  list  all  pairs  of  vertices  that  were  in  positions  v  and  w  of  an  IG  (see 
Figure  1).  Then  I  made  the  mistake  of  asking  why  he  wanted  to  do  such  a  thing.  The 
real  problem  he  was  after  was  to  minimize  the  number  of  node  reductions  required  to 
reduce  a  dag  to  a  single  edge  (see  Section  4  for  a  full  description).  He  felt  sure  that  the 
problem was NP-hard (maybe I could help him prove it) and the enumeration of the
IG’s would reduce it to vertex cover, a more manageable problem. Some weeks later
I  made  the  observation  that  the  vertex  cover  problem  he  was  hoping  to  reduce  to  (a) 
did  not  model  the  original  problem  exactly,  and  (b)  when  it  did,  the  graph  defining 
the  vertex  cover  problem  was  transitive,  which  made  me  suspicious  that  there  might 
be  a  polynomial  time  algorithm  for  the  original  problem.  After  more  than  a  year  of 
working  through  examples  and  counterexamples  of  a  variety  of  different  conjectures, 
and enlisting the aid of a colleague from Duke (Wolfgang Bein, who knew a lot more
about series-parallel dags than I did), I developed a polynomial time algorithm for the
original problem, reducing it to vertex cover in a transitive dag [48]. It took me almost
another  year  to  fully  understand  the  applications,  and  I’m  still  not  convinced  that  the 
right  quantity  is  being  optimized  for  some  of  them. 

Similar stories are behind my current work on via minimization in VLSI and hypercube
embedding, except that in each of these cases it was a student, rather than a
colleague  from  another  department,  who  provided  my  original  contact  with  the  problem. 

2. Research Objectives. Many graph problems arising in practical applications
are NP-hard when stated as general graph problems, but may in fact be easy when the
special structure of graphs arising in the application is considered. Johnson [25] gives a
survey of results applying to many special classes of graphs. Many of the polynomial-time
results are due to dynamic programming algorithms that take advantage of some
form of decomposition for the graphs in question. The long-term objective of this research
is to develop general techniques for obtaining polynomial-time algorithms for
special cases of NP-hard problems, particularly graph problems. Several shorter term
objectives flow naturally out of previous work. One is to extend linear-time dynamic
programming algorithms to larger classes of graphs than the ones in which they are currently
used (see, for example, [4]). Another is to develop efficient algorithms for a variety of
problems on graphs that are “nearly” series-parallel (see [48] for one approach). Finally,
to render the long-term objective more manageable, it is necessary to find linear-time
or  log-space  reductions  among  problems  known  to  be  in  P.  The  importance  of  this  is 
discussed  further  in  the  following  section. 

3. Research Issues. Dynamic programming has been used successfully as a general
technique for obtaining efficient algorithms for problems on special classes of graphs,
and for obtaining polynomial-time algorithms in general. For example, Prim’s algorithm
for minimum spanning trees and Dijkstra’s algorithm for shortest paths can be
construed as special cases of dynamic programming. Examples using special classes
of graphs are legion (see [25] for a partial survey). Unifying approaches, such as that
taken by Bern et al [4], are particularly helpful in the design of efficient and elegant algorithms.
Other than dynamic programming, the main general techniques for obtaining
polynomial-time algorithms are greedy, divide and conquer, successive augmentation,
and linear programming. Greedy and divide and conquer can be viewed as special cases
of dynamic programming. The other two techniques are more specialized, but in some
cases, for example network flows, are being supplanted by simpler techniques. This suggests
that dynamic programming is a universal technique for obtaining polynomial-time
algorithms. 

A more general but related issue is the following. When tackling a new computational
problem, a researcher is often focused on the question, “can I design an algorithm
for this problem?” This leads to a proliferation of algorithms, some essentially identical
to others already published. One of the goals of theoretical research in the area of algorithms
is to propose general frameworks for algorithm design to go along with general
categories  of  problems.  The  important  questions,  if  one  takes  the  theoretical  point  of 
view,  are 

1. What other well-known problems are at least as easy (hard) as the new problem?

2. Is it possible to design an algorithm for the new problem within a given restricted
framework?

The first of these is unquestionably useful to both theory and practice. Connections
among different problems stimulate progress on all fronts and allow researchers to determine
when progress is unlikely. Connections also establish a set of central problems
on  which  researchers  can  focus  their  efforts.  For  example,  matrix  multiplication  is  the 
dominating  component  in  the  complexity  of  many  other  problems. 

The  second  question  at  first  glance  appears  to  restrict  the  search  for  algorithms 
unnecessarily.  Why  should  we  insist  on  a  dynamic  programming  algorithm,  for  example, 
when  any  algorithm  will  do?  On  the  practical  side,  implementation  effort  is  reduced 
for algorithms that conform to a specific framework. Also the search for solutions to
new problems becomes more focused if it is restricted to frameworks that have already
proved successful for similar problems. On the theory side, restricted frameworks put
lower bounds within reach, lower bounds that say any algorithm for problem P within
framework F takes time Ω(f(n)). Such a lower bound often leads to the discovery of
a faster algorithm when the restrictions of F are relaxed. In any case, either a lower
bound or an upper bound will lead to new insights about the problem and enhance
our  understanding  of  the  framework.  An  important  issue,  particularly  in  the  case 
of  a  framework  as  general  as  dynamic  programming,  is  how  to  define  the  framework 
precisely.  The  definition  must  be  restrictive  enough  to  allow  for  the  possibility  of  lower 
bounds  (this  may  be  difficult  in  the  case  of  dynamic  programming),  but  flexible  enough 
to  admit  a  broad  range  of  algorithms. 

Another issue is the classification of NP-hard graph problems by relative difficulty
using special classes as a guide. For example, hypercube embedding, because it is NP-hard
even for trees [54], may be regarded as harder than dominating set, which is easy
for  trees.  In  turn,  dominating  set  is  NP-hard  for  chordal  graphs,  and  thus  may  be 
regarded  as  harder  than  vertex  cover  (see  [25]).  Any  nested  sequence  of  graph  classes 
induces  a  linear  order  on  classes  of  NP-complete  graph  problems.  Two  problems  are 
in  the  same  class  if  they  are  NP-complete  for  the  same  classes  of  graphs.  Aside  from 
being  an  interesting  intellectual  exercise,  filling  in  the  details  of  such  a  classification 
scheme  may  lead  to  insights  about  the  structure  of  NP-complete  graph  problems  on  the 
theoretical  side  and  better  heuristics  and  algorithms  for  solving  them  on  the  practical 
side. 


4.  Approaches.  In  its  most  general  form,  dynamic  programming  is  a  collection 
of  rules  for  determining  the  solution  to  an  instance  of  an  optimization  problem.  The 
optimum  solution  is  either  computed  directly,  if  the  instance  is  small  enough,  or  as  a 
simple  function  of  smaller  instances  of  the  same  problem.  If  the  number  of  distinct 
smaller  instances  that  need  to  be  considered  during  the  computation  of  a  large  instance 
is  bounded  by  a  polynomial  in  the  size  of  the  large  instance,  a  polynomial  time  algorithm 
typically  results.  In  the  case  of  graph  problems,  the  smaller  instances  almost  always 
involve  subgraphs  of  the  original  graph.  Many  classes  of  graphs  can  be  defined  in 
terms  of  composition  rules  that  combine  smaller  subgraphs  into  larger  ones.  Bern  et  al 
[4]  present  a  general  theory  for  the  interaction  among  composition  rules  and  dynamic 
programming  algorithms. 
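
The general scheme can be sketched on a problem close to the PERT setting. This is an illustration only, not an algorithm from this report: it assumes activities sit on the nodes of a dag and computes the longest total duration from a source, with one subinstance per vertex so that the number of distinct subinstances is linear in the size of the dag. The function and parameter names are hypothetical.

```python
from functools import lru_cache

def critical_path_length(successors, duration, source):
    """Longest total duration of a path starting at `source` in a dag.

    `successors` maps a node to its list of successors and `duration`
    maps a node to its activity time (assumed activity-on-node model).
    """
    @lru_cache(maxsize=None)           # each subinstance is solved once
    def longest_from(v):
        # Subinstance: the part of the dag reachable from v.
        tails = [longest_from(w) for w in successors.get(v, ())]
        return duration[v] + (max(tails) if tails else 0)
    return longest_from(source)
```

With successors `{"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}` and durations `{"a": 2, "b": 3, "c": 5, "d": 1}`, the longest path a, c, d has length 8.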

Where composition rules lead to polynomial-time algorithms on special classes of
graphs, it is sometimes possible to extend the rules to general graphs and obtain algorithms
that are exponential only in a parameter that measures how often the original
rules had to be extended. For example, any two-terminal series-parallel dag (directed
acyclic graph) can be defined in terms of a unique (up to associativity of the operators)
decomposition tree. Every edge $e = (v, w)$ is a leaf of the tree and has terminals $v$ and
$w$. An interior node representing parallel composition joins two subtrees $T_1$ and $T_2$, each
having terminals $v$ and $w$, into a single tree $T_1 + T_2$. If $G_1$ is the dag represented by $T_1$
and $G_2$ the dag represented by $T_2$, the dag represented by $T_1 + T_2$ is $G_1 \cup G_2$ (the two
are joined only at vertices $v$ and $w$). A series composition joins $T_1$ with terminals $u, v$
and $T_2$ with terminals $v, w$ into $T_1 \cdot T_2$. Again, the dag represented by $T_1 \cdot T_2$ is $G_1 \cup G_2$,
except now the two subgraphs only share vertex $v$. The source of the new dag is $u$ and
its sink is $w$.

When traversed bottom up, a decomposition tree for a series-parallel dag $G$
corresponds to a reduction of $G$ to a single edge. At a parallel composition node with
terminals $v$ and $w$, we do a parallel reduction, replacing two edges $e$, $f$ joining $v$
to $w$ by a single edge $g = (v, w)$. At a series composition node which joins two subgraphs
at vertex $v$, we do a series reduction at $v$. This occurs when $e = (u, v)$ is the unique
edge into $v$ and $f = (v, w)$ is the unique edge out of $v$: $e$ and $f$ are replaced by
$g = (u, w)$.

Algorithms for problems such as vertex cover can be formulated in terms of the
reduction sequence. For every edge $e = (v, w)$ occurring during the reduction ($e$ may
be an edge of the original dag or it may represent a whole subgraph which is joined
to the rest of the graph at the two endpoints of the edge), let $VC(e, b_v b_w)$ be the
cardinality of the minimum vertex cover for the subgraph represented by $e$, given that
$x = v$ or $w$ is included in the cover if $b_x = 1$ and excluded if $b_x = 0$. If $e$ is an
edge of $G$, then $VC(e, 00) = \infty$ (no cover can exclude both endpoints of the edge),
$VC(e, 01) = VC(e, 10) = 1$, and $VC(e, 11) = 2$. If $g$ is the edge resulting from a
parallel reduction of $e$ and $f$, then

$$VC(g, b_v b_w) = VC(e, b_v b_w) + VC(f, b_v b_w) - (b_v + b_w)$$

(note that $v$ or $w$ is counted twice if it is included in both covers). If $g$ results from a
series reduction of $e = (u, v)$ and $f = (v, w)$, then

$$VC(g, b_u b_w) = \min\{VC(e, b_u 0) + VC(f, 0 b_w),\ VC(e, b_u 1) + VC(f, 1 b_w) - 1\}.$$

The cardinality of the minimum vertex cover of $G$, a two-terminal series-parallel dag,
can be computed by reducing $G$ to a single edge $e = (s, t)$ and considering the minimum
value among the $VC(e, b_s b_t)$.
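
The two recurrences above can be sketched in Python as follows. This is only an
illustrative sketch under stated assumptions: the function names (leaf, parallel,
series, min_vertex_cover) and the dictionary that stores the four values
$VC(e, b_v b_w)$ are introduced here and do not come from the report.

```python
INF = float("inf")
BITS = [(0, 0), (0, 1), (1, 0), (1, 1)]

def leaf():
    """Table for an original edge: excluding both endpoints is infeasible."""
    return {(0, 0): INF, (0, 1): 1, (1, 0): 1, (1, 1): 2}

def parallel(e, f):
    """e and f share both terminals; a covered terminal must be counted once."""
    return {(bv, bw): e[bv, bw] + f[bv, bw] - (bv + bw) for bv, bw in BITS}

def series(e, f):
    """e = (u, v), f = (v, w); minimize over whether the middle vertex v is covered."""
    return {
        (bu, bw): min(e[bu, 0] + f[0, bw], e[bu, 1] + f[1, bw] - 1)
        for bu, bw in BITS
    }

def min_vertex_cover(table):
    """Minimum over the four terminal choices of the final single edge (s, t)."""
    return min(table.values())

# Hypothetical example: s -> a -> t plus the edge s -> t (a triangle when undirected).
# Reduce the path by a series reduction at a, then merge with s -> t in parallel.
triangle = parallel(series(leaf(), leaf()), leaf())
print(min_vertex_cover(triangle))
```

Each reduction costs constant time because a table has only four entries, so the
whole computation is linear in the number of reductions.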

An arbitrary two-terminal dag can be reduced to a single edge if one additional
operation is added to our repertoire. A node reduction at $v$ occurs when $v$ has indegree
or outdegree 1 (a node reduction is a generalization of a series reduction). Suppose $v$ has
indegree 1 and let $e = (u, v)$ be the unique edge into $v$. Let $f_1 = (v, w_1), \ldots,
f_k = (v, w_k)$ be the edges out of $v$. Replace $\{e, f_1, \ldots, f_k\}$ by
$\{g_1, \ldots, g_k\}$, where $g_i = (u, w_i)$. The case where $v$ has outdegree 1 is
symmetric ($e = (v, w)$, $f_i = (u_i, v)$, $g_i = (u_i, w)$).

For convenience, let $G \circ v$ denote the result of a node reduction with respect to
node $v$, and let $[G]$ denote the graph that results when all possible series and parallel
reductions have been applied to $G$ (this is well defined because series and
parallel reductions obey the Church-Rosser property: the order in which reductions are
applied does not affect the final outcome [53]). A dag $G$ is said to be irreducible if
$[G] = G$.

Let $\mu(G)$, the reduction complexity of $G$, be the minimum number of node reductions
which are sufficient (along with series and parallel reductions) to reduce $G$ to a single
edge. More precisely, $\mu(G)$ is the smallest $c$ for which there exists a sequence
$v_1, \ldots, v_c$ such that $[\cdots[[[G] \circ v_1] \circ v_2] \cdots \circ v_c]$ is a
single edge. The sequence $v_1, \ldots, v_c$ is called a reduction sequence. Last year,
Wolfgang Bein and I developed a polynomial-time algorithm for computing $\mu(G)$ [48]. The
problem of computing $\mu(G)$ is reduced to a problem of finding a minimum vertex cover in
a transitive auxiliary graph $C(G)$.
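
To make the definitions concrete, here is a small Python sketch of series, parallel,
and node reductions, together with an exhaustive search for the reduction complexity.
It is emphatically not the polynomial-time algorithm of [48]: the search is exponential
and intended only for tiny examples. The function names and the edge-set representation
(a set of (tail, head) pairs, which merges parallel edges automatically) are assumptions
made for this illustration.

```python
from itertools import count

def sp_reduce(edges, s, t):
    """Apply series and parallel reductions until none applies (the graph [G])."""
    edges = set(edges)  # storing edges in a set performs parallel reductions for free
    changed = True
    while changed:
        changed = False
        for v in {x for e in edges for x in e} - {s, t}:
            ins = [e for e in edges if e[1] == v]
            outs = [e for e in edges if e[0] == v]
            if len(ins) == 1 and len(outs) == 1:  # series reduction at v
                edges -= {ins[0], outs[0]}
                edges.add((ins[0][0], outs[0][1]))
                changed = True
                break
    return frozenset(edges)

def node_reduce(edges, v):
    """Node reduction at v, or None if v has neither indegree 1 nor outdegree 1."""
    ins = [e for e in edges if e[1] == v]
    outs = [e for e in edges if e[0] == v]
    if len(ins) == 1:
        new = {(ins[0][0], w) for (_, w) in outs}
    elif len(outs) == 1:
        new = {(u, outs[0][1]) for (u, _) in ins}
    else:
        return None
    return frozenset((set(edges) - set(ins) - set(outs)) | new)

def reduction_complexity(edges, s, t):
    """Breadth-first search over reduction sequences (exponential; small dags only)."""
    frontier = {sp_reduce(edges, s, t)}
    for c in count():
        if any(len(g) == 1 for g in frontier):
            return c
        nxt = set()
        for g in frontier:
            for v in {x for e in g for x in e} - {s, t}:
                h = node_reduce(g, v)
                if h is not None:
                    nxt.add(sp_reduce(h, s, t))
        if not nxt:
            raise ValueError("input is not a two-terminal dag")
        frontier = nxt

# Hypothetical example: the "bridge" dag, the smallest dag that is not series-parallel.
bridge = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]
print(reduction_complexity(bridge, "s", "t"))
```

On the bridge dag neither a series nor a parallel reduction applies, but a single node
reduction at either interior vertex makes the rest collapse to one edge.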

Suppose $v_1, \ldots, v_c$ is a reduction sequence for $G$ and let $\hat{V}$ be an arbitrary
subset of $\{v_1, \ldots, v_c\}$. Using a minor modification of the vertex cover algorithm
described above, we can compute the cardinality of the minimum vertex cover of $G$ under
the restriction that every vertex of $\hat{V}$ must be included in the cover and every
vertex of $\{v_1, \ldots, v_c\} - \hat{V}$ must be excluded. Let $g_1, \ldots, g_k$ be the
edges resulting from a node reduction at $v$, where $v$ has indegree 1. If $v \in \hat{V}$,
then

$$VC(g_i, b_u b_{w_i}) = VC(e, b_u 1) + VC(f_i, 1 b_{w_i}) - 1;$$

otherwise

$$VC(g_i, b_u b_{w_i}) = VC(e, b_u 0) + VC(f_i, 0 b_{w_i}).$$

Since every possible subset of $\{v_1, \ldots, v_c\}$ must be considered as a choice for
$\hat{V}$, the result is an $O(m 2^c)$ algorithm for computing vertex cover in a
two-terminal dag.

Problems such as vertex cover, independent set, dominating set, clique, and coloring,
which can be solved by the method described above, are most often formulated on
undirected graphs. A biconnected undirected graph can be turned into a two-terminal
dag by means of an st-numbering, a numbering of the vertices in which vertex 1 is
adjacent to vertex $n$ and each other vertex has at least one lower numbered and one
higher numbered neighbor [33,17]. Since solutions to the graph problems listed above can
be computed separately for each biconnected component and then combined, a natural
definition for the reduction complexity of an undirected graph $G$ is the maximum over
all biconnected components $C$ of $G$ of the minimum over all dags $C'$ resulting from
st-numberings of $C$ of $\mu(C')$. Undirected graphs of complexity 0 are exactly the
undirected series-parallel graphs. Recognition of undirected graphs of any fixed
complexity $c$ appears to be difficult (there are simple examples that show $c$ to be
dependent on the st-numbering chosen, even if the numbering of 1 and $n$ is fixed), but is
known to be in P by non-constructive methods, using the observation that the complexity
of an undirected graph never increases when an edge is deleted or contracted [40,41,18].
An interesting avenue of research is to find specific algorithms for recognizing undirected
graphs of complexity $c$. It is also possible that the recognition problem is NP-complete
if $c$ is part of the input.
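
The definition of an st-numbering translates directly into a check and an orientation
step. The sketch below assumes an adjacency-set representation and a dictionary mapping
each vertex to its number; both, along with the function names, are hypothetical choices
made for illustration. It only verifies a given numbering and does not compute one.

```python
def is_st_numbering(adj, number):
    """Check the st-numbering conditions for a numbering 1..n of the vertices."""
    n = len(number)
    by_number = {k: v for v, k in number.items()}
    s, t = by_number[1], by_number[n]
    if t not in adj[s]:  # vertex 1 must be adjacent to vertex n
        return False
    for v, k in number.items():
        if v in (s, t):
            continue
        has_lower = any(number[w] < k for w in adj[v])
        has_higher = any(number[w] > k for w in adj[v])
        if not (has_lower and has_higher):
            return False
    return True

def orient(adj, number):
    """Direct every edge from its lower numbered to its higher numbered endpoint."""
    return sorted((v, w) for v in adj for w in adj[v] if number[v] < number[w])

# Hypothetical example: the 4-cycle a-b-c-d-a.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
good = {"a": 1, "b": 2, "c": 3, "d": 4}
print(is_st_numbering(cycle, good), orient(cycle, good))
```

Because every interior vertex has a lower and a higher numbered neighbor, the oriented
graph is acyclic with vertex 1 as its only source and vertex $n$ as its only sink.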

Several  practical  applications  of  reduction  complexity  are  discussed  in  the  next 
section.  A  primary  activity  of  this  research  in  the  next  year  will  be  to  learn  about  dag 
reduction  by  applying  it  to  a  wide  variety  of  problems  of  varying  difficulty.  In  some 
cases,  we  may  find  that,  whereas  dag  reduction  is  not  a  useful  algorithmic  technique, 
our  attempts  to  apply  it  have  yielded  non-trivial  insights  about  the  problem  in  question. 

As I have gained experience with dynamic programming, I have also been drawn
to  other  problems  and  other  special  classes  of  graphs  where  dynamic  programming  is 
or  could  be  a  key  factor. 

One of these is hypercube embedding: given a graph $G = (V, E)$ and an integer
$k$, find a one-to-one mapping $h : V \to \{0, \ldots, 2^k - 1\}$ that minimizes either
$\max_{\{v,w\} \in E} d(h(v), h(w))$, the dilation, or
$\left(\sum_{\{v,w\} \in E} d(h(v), h(w))\right) / |E|$, the average dilation, where
$d(i, j)$ is the Hamming distance between $i$ and $j$, i.e. the number of
bits that differ in their binary representations. Minimizing dilation is important when
synchronous parallel algorithms are mapped to the hypercube architecture (in this case
$G$ represents the communication structure of the algorithm). Average dilation,
particularly a weighted average, may be more important in the case of asynchronous
algorithms. Average dilation is also important to coding theory, where the vertices of $G$
represent words to be coded and edges are between words that have similar meaning (and can
afford to be given similar codes). Both problems are NP-hard even for trees [54], but
the special case of binary (or other bounded-degree) trees is still open.
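
The two objective functions are easy to evaluate for a given mapping. The following
Python sketch, with hypothetical function names and a dictionary standing in for the
mapping $h$, computes both measures; it evaluates an embedding and makes no attempt to
optimize one.

```python
def hamming(i, j):
    """Number of bit positions in which the binary representations of i and j differ."""
    return bin(i ^ j).count("1")

def dilation(edges, h):
    """Maximum Hamming distance between the images of the endpoints of any edge."""
    return max(hamming(h[v], h[w]) for v, w in edges)

def average_dilation(edges, h):
    """Mean Hamming distance between the images of the endpoints over all edges."""
    return sum(hamming(h[v], h[w]) for v, w in edges) / len(edges)

# Hypothetical example: a path on four vertices embedded in the 2-dimensional hypercube.
path = [(0, 1), (1, 2), (2, 3)]
gray = {0: 0b00, 1: 0b01, 2: 0b11, 3: 0b10}      # Gray-code order
identity = {0: 0b00, 1: 0b01, 2: 0b10, 3: 0b11}  # plain binary order
print(dilation(path, gray), dilation(path, identity))
```

The Gray-code mapping places consecutive path vertices on adjacent hypercube nodes,
whereas the plain binary order stretches the middle edge across two dimensions.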

Previous  work  on  average  dilation  has  been  primarily  experimental,  concentrating 
on  various  heuristics  [6,32,10,16].  Theoretical  work  on  dilation  has  used  separators 
to  obtain  low  dilation  embeddings  for  various  special  cases  [55,36,5].  An  important 
outstanding  conjecture  is  that  any  binary  tree  can  be  embedded  with  dilation  2.  The 
best  result  obtained  so  far  yields  dilation  5  embeddings  for  arbitrary  binary  trees  [36]. 
We  are  considering  three  problems:  (1)  minimizing  dilation,  (2)  minimizing  average 
dilation,  and  (3)  minimizing  the  number  of  edges  that  have  to  be  deleted  in  order  to 
achieve  an  embedding  with  fixed  dilation  d.  All  three  are  NP-hard  for  trees,  but  open 
for  fixed-degree  trees.  The  goals  of  the  hypercube  research  are: 

1.  Settle  the  status  of  problems  (1),  (2),  and  (3)  for  binary  trees  (NP-hard  or 
polynomial). 

2.  Improve  existing  bounds  for  dilation  in  binary  trees. 

3. Develop heuristics with provably good performance for each of the three problems.

4.  Develop  good  strategies  for  solving  each  of  the  three  problems  and  compare 
them  experimentally. 

The other problem is what I call maximum node coloring: given a graph $G = (V, E)$
and an integer $k$, find a maximum cardinality (or weight) subset $V'$ of $V$ such that the
subgraph of $G$ induced by $V'$ can be colored with at most $k$ colors. This problem has
application to code optimization (register allocation) [1,8,13] and to unconstrained via
minimization [23]. In the case of code optimization, the vertices of $G$ represent variables
and an edge means that the two variables cannot be stored in the same register; $k$ is
the number of registers available - the uncolored variables must be stored in memory.
In via minimization $G$ is a circle graph whose vertices represent wires to be routed and an
edge means that the two wires cross and cannot be routed on the same layer; $k$ is the
number of layers available - the uncolored wires must be routed by means of vias, or
contact cuts from one layer to another. The maximum node coloring problem is known
to be NP-hard for planar graphs [34,56] and for circle graphs [39,42], in both cases even
if $k = 2$. It appears to be solvable for interval graphs and permutation graphs, using
dynamic programming, if $k$ is fixed [45]. Both classes are important because exact
solutions for graph coloring on them are used to obtain good heuristics for circle graph
coloring [50,52] (note: circle graph coloring is NP-hard when $k \ge 4$, but still open when
$k = 3$). Interval graphs occur in the register allocation problem when it is restricted to
straight-line segments of code. I intend to learn more about maximum node coloring by
attempting to find polynomial algorithms or NP-completeness results on other classes
of graphs.
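
To pin down the problem statement, here is a brute-force Python sketch of maximum node
coloring. It is exponential and is not one of the dynamic programming algorithms
discussed in this report; the function names and the adjacency-set input format are
assumptions made for illustration.

```python
from itertools import combinations

def k_colorable(vertices, adj, k):
    """Backtracking test: can the subgraph induced by `vertices` be properly k-colored?"""
    vertices = list(vertices)
    color = {}

    def assign(i):
        if i == len(vertices):
            return True
        v = vertices[i]
        for c in range(k):
            # Neighbors outside the subset are never colored, so they are ignored.
            if all(color.get(w) != c for w in adj[v]):
                color[v] = c
                if assign(i + 1):
                    return True
                del color[v]
        return False

    return assign(0)

def max_node_coloring(adj, k):
    """Largest vertex subset whose induced subgraph is k-colorable (exhaustive search)."""
    vs = list(adj)
    for size in range(len(vs), -1, -1):
        for subset in combinations(vs, size):
            if k_colorable(subset, adj, k):
                return set(subset)

# Hypothetical interference graph: a 5-cycle of variables and k = 2 registers.
cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(len(max_node_coloring(cycle5, 2)))
```

In the register allocation reading, the vertices left out of the returned set are the
variables that must be kept in memory.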

An overview of the approaches currently being considered in this research is given
by the following list.

1. Look at dynamic programming algorithms that rely on composition rules for
special classes of graphs and extend these to more general graphs, yielding
algorithms that are exponential only in some measure of how nearly the graph
belongs to the special class.

2. Look for dynamic programming algorithms for problems that are not known to
be solvable in polynomial time, in some cases settling for algorithms that are
exponential only in a parameter that is not likely to be large in practice.

3. Look for dynamic programming algorithms for problems that are known to be
solvable in polynomial time, but by methods that are not related to dynamic
programming. Examples include matching and network flows (see Section 7 for
more details).

5.  Progress.  Progress  to  date  has  been  in  three  main  application  areas.  First 
there has been much work on the details and proofs of the Stallmann-Bein result on
node  reduction  and  its  application  to  problems  in  operations  research.  Some  of  this  work 
is  joint  work  with  Jerzy  Kamburowski,  who  independently  formulated  an  algorithm 
for  computing  the  reduction  complexity  of  a  dag.  I  have  been  able  to  show  that  his 
algorithm  is  essentially  equivalent  to  ours  and  this  has  simplified  some  of  the  proofs 
in our paper. We have also been able to reduce the time bound of the algorithm from
$O(n^3)$ to $O(n^{2.5})$. I am learning about the various OR applications, and expect to be
able  to  suggest  improvements  in  their  formulation  -  that  is,  to  interpret  them  in  a  more 
general  framework.  This  work  is  described  in  more  detail  below,  after  a  brief  description 
of  progress  on  two  other  fronts. 

In the area of hypercube embedding, a student, Woei-Kae Chen, and I have
attempted to use dynamic programming to obtain dilation 2 embeddings of binary trees
in hypercubes. The only success we have had so far is an algorithm that embeds binary
trees  with  an  asymptotic  average  dilation  of  2  —  [12].  This  is  not  too  promising 

because many simple heuristics routinely obtain average dilations close to 1 in
experimental studies [9]. For this particular project, we have used both theoretical and
experimental approaches to attack the problem. Many of our failed dynamic programming
algorithms were based on conjectures whose smallest counterexample had 32 or
more nodes (in one case, only one counterexample of size 32 existed), so computational
trials based on exhaustive search were a valuable resource. We are currently comparing
a variety of heuristics, including simulated annealing, in an experimental study [11].
Our  methodology  is  similar  to  that  of  Johnson  et  al  [26]. 

My  work  with  Tom  Hughes  and  Wentai  Liu  in  the  area  of  via  minimization  has  led 
to  a  submitted  paper  [49]  and  an  NSF  proposal.  I  anticipate  a  lot  of  future  work  on 
algorithms  and  heuristics  for  various  subproblems  related  to  UVM-based  routing.  Some 
preliminary  work  on  max  node  coloring  in  special  classes  that  are  subclasses  of  circle 
graphs  is  likely  to  be  useful  in  developing  efficient  heuristics.  The  work  on  UVM  based 
routing has also led me to some ideas on a constrained planar embedding problem that
appears to be difficult, but many special cases can be solved in polynomial time [46].
Dynamic programming may be a factor in solving the general version of this problem,
if an efficient algorithm exists.

I am in the process of writing a paper with Salah E. Elmaghraby and Jerzy
Kamburowski on applications of dag reduction [15]. The paper features some simplifications
of the definitions and proofs in the Stallmann-Bein paper as well as descriptions of
several applications to problems in operations research. Consider, for example, the
following three problems on a two-terminal acyclic network $G = (V, E)$ in which the
weight of each edge $e$ is a random variable $X_e$ governed by a probability distribution
function $F_e$:

1. computation of the pdf of project completion time, where the network is
interpreted as a PERT network (activity-on-arc) and edge weights represent
durations of the activities,

2.  computation  of  the  pdf  of  the  length  of  the  shortest  route  from  source  to  sink, 
where  the  network  is  a  transportation  network  and  weights  represent  travel 
time,  and 

3.  computation  of  the  reliability  of  the  network  given  that  weights  are  either  0  or 
1  (0  if  the  edge  fails,  1  if  it  remains  intact). 

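Let $\mathcal{P}$ be the set of all source-sink paths in $G$. Then the solutions to the
three problems may be formulated as follows:

1. $T = \max_{P \in \mathcal{P}} \sum_{e \in P} X_e$

2. $X = \min_{P \in \mathcal{P}} \sum_{e \in P} X_e$

3. $R = \max_{P \in \mathcal{P}} \prod_{e \in P} X_e$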

The pdf’s of random variables $T$, $X$, and $R$ can be computed using dag reduction as
follows (we compute the pdf associated with each edge introduced during the reduction).
Suppose $g$ is the parallel reduction of $e$ and $f$. Then, in the case of $T$ and $R$,
$X_g = \max\{X_e, X_f\}$ and $F_g(x) = F_e(x) F_f(x)$. In the case of $X$,
$X_g = \min\{X_e, X_f\}$ and $F_g(x) = 1 - (1 - F_e(x))(1 - F_f(x))$. Now let $g$ be the
result of a series reduction of $e$ and $f$. In the case of $T$ and $X$, $X_g = X_e + X_f$
and $F_g(x) = F_e * F_f(x) = \int_0^x F_e(x - y)\, dF_f(y)$.
In the case of $R$, $X_g = \min\{X_e, X_f\}$ and $F_g(x) = 1 - (1 - F_e(x))(1 - F_f(x))$.
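
For edge weights that take finitely many values, the series and parallel rules above
can be sketched in Python as follows. The representation of a distribution as a
dictionary from value to probability, the use of exact fractions, and all function names
are assumptions made for this illustration; the edge weights being combined are assumed
independent, which is exactly the situation for series and parallel reductions.

```python
from collections import defaultdict
from fractions import Fraction

def combine(de, df, op):
    """Distribution of op(X_e, X_f) for independent discrete X_e and X_f."""
    out = defaultdict(Fraction)
    for x, p in de.items():
        for y, q in df.items():
            out[op(x, y)] += p * q
    return dict(out)

def parallel_max(de, df):
    """Parallel reduction for T and R: X_g = max{X_e, X_f}."""
    return combine(de, df, max)

def parallel_min(de, df):
    """Parallel reduction for X: X_g = min{X_e, X_f}."""
    return combine(de, df, min)

def series_sum(de, df):
    """Series reduction for T and X: X_g = X_e + X_f (a discrete convolution)."""
    return combine(de, df, lambda x, y: x + y)

def cdf(d, x):
    """F(x) = Prob{X <= x} for a discrete distribution d."""
    return sum(p for v, p in d.items() if v <= x)

# Hypothetical activity duration: 1 or 2 time units, each with probability 1/2.
half = Fraction(1, 2)
d = {1: half, 2: half}
print(series_sum(d, d), parallel_max(d, d))
```

The distribution functions of the results agree with the product and complement-product
formulas in the text; for example, cdf(parallel_max(d, d), 1) equals cdf(d, 1) squared.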

Node reductions are complicated by the fact that the pdf’s of the $g_i$ are not
independent. Let $g_1, \ldots, g_k$ be the edges resulting from a node reduction (where $v$
has indegree 1 - the other case is symmetric) of $e$ and $f_1, \ldots, f_k$. We reflect the
dependence among the $F_{g_i}$ by computing for each $g_i$ the conditional pdf
$F_{g_i}(x; t)$, that is, the pdf of $X_{g_i}$ given that the value of $X_e$ is fixed at
$t$. The final result for the network must then be
integrated over all possible values of $t$ with respect to $dF_e(t)$.

If the number of distinct values taken on by edge weights is a fixed constant $U$, the
total time required to compute $\mathrm{Prob}\{Q < x\}$ for a given $x$, where $Q$ is one of
$T$, $X$, or $R$, is $O(m U^c)$, where $c$ is the reduction complexity of $G$ (in the case
of $R$, $U = 2$; a simpler formulation is given in [48]). It appears that for these
problems, as well as for many other problems, the solution on autonomous subnetworks,
essentially triply connected components of the dag, can be computed independently.
Therefore, the correct measure of reduction complexity should be the maximum over all
subnetworks of the complexity defined above. This would be equally easy to compute; in
fact, autonomous subnetworks correspond to connected components of the auxiliary graph [48].

The above results for PERT networks rely on an activity-on-arc representation,
which is the one used most often in computations. However, for formulating problems,
the activity-on-node representation is more natural, since it does not require the
introduction of dummy activities. The problem of minimizing the number of dummy
activities when translating from activity-on-node to activity-on-arc representation is
NP-hard [30]. Kamburowski has conjectured that there is a polynomial algorithm for
translating an activity-on-node network $G$ to an activity-on-arc network $G'$ that has
minimum reduction complexity among all such $G'$. This would allow us to extend
our algorithmic results to scheduling problems with precedence constraints (precedence
constraints are usually represented by activity-on-node networks). A student, David
Michael, is working on this conjecture for his PhD thesis.

Other applications of reduction complexity include conditional Monte Carlo sampling,
bounds on expected values of random variables in stochastic networks, and dynamic
programming approaches to optimal resource allocation in PERT networks.

Another area of progress has been in improving time bounds for computing the
auxiliary graph $C(G)$, and for computing a minimum vertex cover of a transitive graph.
The latter has been shown to be equivalent to bipartite matching [27], improving the
time bound from $O(n^3)$ to $O(n^{2.5})$. A dag whose transitive closure is $C(G)$ can be
computed  in  time  O(n^)  [47].  It  is  also  easy  to  show  that  computing  C(G)  is  at  least  as 
hard as computing the transitive closure of $G$, and that computing $\mu(G)$ is at least as
hard  as  bipartite  matching,  hence  all  our  bounds  are  tight,  barring  any  improvements 
in  the  time  bounds  for  transitive  closure  or  matching. 

6.  Research  Directions.  The  primary  research  directions  suggested  by  this  project 
have  already  been  discussed  in  Section  4.  One  important  issue  that  has  been  neglected 
so far, however, is that of efficient parallel algorithms. Dynamic programming algorithms
based on tree structured decomposition schemes, such as those for series-parallel
graphs, can be efficiently parallelized using tree contraction [35,22]. Efficient dynamic
programming  algorithms  for  systolic  arrays  and  meshes  have  also  been  proposed  [29]. 
These  observations  suggest  two  important  lines  of  research. 

1. To what extent can dynamic programming schemes that yield sequential
polynomial-time algorithms be adapted to parallel models, such as PRAM, systolic linear
array, or mesh?

2.  Are  there  universal  schemes,  like  dynamic  programming,  that  lead  to  efficient 
parallel  algorithms  for  problems  on  special  classes  of  graphs,  or  for  general 
graphs? 

The most promising approach is to attempt to generalize existing parallel algorithms for
problems that have been solved sequentially by dynamic programming. Any patterns
that emerge should be applied to problems for which no efficient parallel algorithms
are known. This is just a general idea and is only being pursued in a limited way by
this project. I refer here to the model for on-line systolic graph algorithms mentioned
in the proposal: a model of computation for graph problems in which the processing
unit is able to store a small fixed number of data items per vertex and is able to read
the edges of the graph sequentially as many times as is required to solve the problem
(each reading of the edges is called a pass). The processor itself may be a linear array,
a mesh, or a random access machine (either sequential or parallel). It appears that
while the capabilities of the processor may affect the running time per pass, the total
number of passes is not affected. Problems such as connectivity and biconnectivity
can be solved in $O(1)$ passes, in linear time on an array and in almost linear time on
a sequential machine (see [43,44,51]). Other problems that might be solvable in $O(1)$
passes (though no algorithm is known at this time) are finding a minimum spanning
tree and finding a shortest path between two specific vertices. The only progress I can
report is not the result of my work: an on-line systolic algorithm for minimum spanning
trees by Huang [24].
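
As a rough illustration of the pass model on a sequential machine, the following Python
sketch counts connected components in a single pass over the edges while storing two
data items (a parent pointer and a rank) per vertex. It uses the standard union-find
technique, which runs in almost linear time; this is an assumption chosen for
illustration and is not necessarily the algorithm of [43,44,51]. The function name and
the integer vertex labels are likewise hypothetical.

```python
def count_components(n, edge_stream):
    """One pass over the edges; per-vertex state is a parent pointer and a rank."""
    parent = list(range(n))
    rank = [0] * n

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps the trees shallow
            v = parent[v]
        return v

    components = n
    for u, w in edge_stream:  # each edge is read exactly once
        ru, rw = find(u), find(w)
        if ru != rw:
            if rank[ru] < rank[rw]:
                ru, rw = rw, ru
            parent[rw] = ru  # attach the shallower tree under the deeper one
            if rank[ru] == rank[rw]:
                rank[ru] += 1
            components -= 1
    return components

# Hypothetical example: five vertices, edges arriving as a stream.
print(count_components(5, iter([(0, 1), (1, 2), (3, 4)])))
```

The edges are consumed from an iterator and never stored, which mirrors the restriction
that the processing unit keeps only a fixed amount of data per vertex.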

Since I’m relatively new to this work, I’m not sure I can safely suggest any
approaches that should not be pursued. Most of the approaches I’ve suggested are
brute-force, seat-of-the-pants type research, requiring few deep mathematical results. This
is not to suggest that the work requires no mathematical background or sophistication.
But, as is often the case with combinatorial methods, the mathematics gets made up
as you go along: mathematical ideas emerge as the essence of the problem is more
clearly understood. It is difficult to predict in advance which mathematical tools will
be required for the task at hand. Graph theory and combinatorics are general realms
in which to look for tools, but I’ve found that it’s difficult to keep up with all recent
results in these areas that may be relevant to even one computer science problem.
Communication with people who are knowledgeable in combinatorics and graph theory is
essential when the underlying combinatorial problem has been abstracted out of an
applied problem.

7. Grand Challenge. Two of the hardest problems that are known to be solvable
in polynomial time are graph matching and network flows. Until recently, all
known efficient algorithms for both problems have used some form of augmenting path
search. The advent of “preflow-push” algorithms for network flows [20] suggests that
the flow problem is amenable to more localized strategies. New algorithms for matching
have been motivated by parallel models of computation and have centered on the
computation of symbolic determinants using randomized algorithms [28]. Deterministic
algorithms based on determinants exist for special classes such as planar graphs
(follows directly from ideas outlined in [3]) and strongly chordal graphs [14]. However, for
planar graphs it is not known how to compute the actual matching deterministically;
only the problem of determining whether a perfect matching exists has been solved.

I believe that dag reduction may be helpful in the development of alternate
algorithms for network flows and matching. The grand challenge is to find algorithms
that are either simpler, more efficient, or more easily parallelizable than existing
algorithms for network flows and matching (note: the network flow problem is known to be
log-space complete for P [21], hence a polylog-time parallel algorithm is unlikely).

The flow problem for dags is at least as hard as that for general directed graphs
[38] (it would be an interesting exercise to try to extend this construction to other
problems, such as min cost flows or network reliability). For series-parallel dags, the
network flow problem can be solved in linear time using the following algorithm to
compute $MF(e)$, the maximum flow for the subgraph represented by the edge $e$, for
every edge $e$ that occurs during a reduction of $G$. If $g$ is the result of a parallel
reduction of $e$ and $f$, $MF(g) = MF(e) + MF(f)$; in the case of a series reduction,
$MF(g) = \min\{MF(e), MF(f)\}$.
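
The rule above amounts to a single bottom-up evaluation of the decomposition tree. A
minimal Python sketch follows, assuming a hypothetical tree encoding in which a leaf is
an edge capacity and an interior node is a tuple ("P", left, right) or ("S", left, right);
neither the encoding nor the function name comes from the report.

```python
def max_flow(tree):
    """Maximum source-to-sink flow of the series-parallel dag described by `tree`."""
    if isinstance(tree, (int, float)):  # leaf: a single edge with this capacity
        return tree
    op, left, right = tree
    if op == "P":  # parallel composition: flows through the two parts add up
        return max_flow(left) + max_flow(right)
    if op == "S":  # series composition: the smaller part is the bottleneck
        return min(max_flow(left), max_flow(right))
    raise ValueError(f"unknown composition operator: {op!r}")

# Hypothetical example: a two-edge path (capacities 3 and 5) in parallel with an
# edge of capacity 2.
network = ("P", ("S", 3, 5), 2)
print(max_flow(network))
```

Each tree node is visited once, which gives the linear running time claimed in the text.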

I’m currently looking at how these ideas can be extended to more general dags. One
key idea is that it’s possible to reduce an arbitrary dag using only node reductions on
nodes with indegree 1 (such a reduction sequence may not be minimum, but that’s not
important to what follows). In a node reduction of $v$, where $e = (u, v)$ is the unique
edge into $v$, the key issue is how to distribute the capacity of $e$ among the $g_i$’s that
result from the node reduction. In the preflow-push model this translates into a decision
about how to distribute the excess at $v$ among the edges leading away from $v$. A
generalization of the series-parallel decomposition tree, called a factoring [48], can be
used to guide the excess toward the sink. In general, we push as much of the excess as
possible through an arbitrary edge out of $v$ and keep pushing in a depth-first manner
toward the sink. There are two differences with the standard preflow-push approach. First,
whenever we encounter a vertex $w$ that is not dominated by $v$, we push the excess from
other parts of the dag toward $w$ before pushing any flow out of $w$ (thus we guarantee the
maximum possible excess at $w$ before pushing flow out of $w$); the factoring appears to
be a valuable tool for guiding this excess. Second, rather than pushing flow backwards
when we find that we’re unable to push the excess at $w$ forward, we reroute the excess
by backtracking to a choice point $x$ (a vertex for which a node reduction is required),
which does not dominate $w$; again, factoring appears to help with finding the right $x$.
This is still in the intuitive stage and may not lead to anything, but I feel that it’s
worth pursuing. The existing flow algorithms do not appear to take advantage of the
special structure of dags.

Dag reduction may also play a role in obtaining simpler algorithms for maximum
matching. The central issue here is whether we can restrict the number of ways to match
the nodes that are removed by node reduction. This is a special case of a more general
issue raised for graph problems by Lakshmipathy and Winklmann [31]: given a graph
$G = (V, E)$, a decision problem $P$ defined on $G$, and subsets $V_1$ and $V_2$ of $V$,
such that $V_1 \cup V_2 = V$ and $|V_1 \cap V_2| = s$, how many bits, as a function of $s$,
does a machine knowing only the subgraph induced by $V_1$ need to transmit to a machine
knowing the subgraph induced by $V_2$ so that the second machine can give the correct
answer for $P(G)$? For many NP-complete problems, a lower bound exponential in $s$ can be
shown. Many problems in P have upper bounds polynomial in $s$. The communication complexity
of bipartite matching in this model is open. Resolution of the communication complexity
of matching may lead to linear-time algorithms for planar matching, NC algorithms for
general matching, simpler sequential algorithms for general matching, and a resolution
of the red-green matching problem (in red-green matching, the edges are colored red or
green by the input and the object is to find a perfect matching with a specific number
of red edges; this problem was first posed by Papadimitriou and Yannakakis [37], in
connection with constrained spanning tree problems, and was shown to be in RNC
by Karp et al [28]; no deterministic polynomial time algorithm for it is known). An
exponential lower bound would be quite surprising, and would suggest that no “simple”
algorithms for matching exist.

8. Research Transitions. Since much of this research is motivated by practical
applications and I am in direct contact with people working on the practical issues,
transition is often my primary research task. The most difficult transitional issue for me
has not been one of communicating theoretical results to practitioners - I do that all the
time, most often for theoretical results that are not my own. The problem has been one
of translating transitional work into publishable papers. I become familiar enough with
applications to understand the underlying theoretical issues, but not familiar enough to
understand all the history, lore, and lingo of the application, i.e., to be able to publish
results in a journal devoted to the application. Joint papers with applications people are
a possibility, and I’m doing some of that now. It’s sometimes hard for me to resist the
temptation to nitpick at everything that doesn’t have a solid theoretical foundation, and
the people I work with are often slowed down by my participation in papers, proposals,
and  on  PhD  thesis  committees.  The  nitpicking  is  important  and  usually  leads  everyone 
involved  to  a  better  understanding  of  the  central  issues.  More  journals  and  conferences 
devoted  to  the  interface  between  theory  and  practice  might  help,  and  there’s  already  a 
trend  in  that  direction. 

9.  Technological  Impacts.  This  research  would  be  aided  by  software  tools  for 
animation of graph algorithms. As a step in that direction, I have proposed the following
independent study project for a student. Design and implement a system that
allows a user to input directed or undirected graphs using a mouse pointing device in
conjunction with X-Windows. The system should support such standard operations
as adding a vertex, creating an edge between two vertices, deleting an edge or vertex,
labeling an edge or vertex, and moving a vertex to a different screen position. Output
should be a graph in adjacency list or adjacency matrix format, accessible to an
algorithm  implementation.  Algorithm  animation  systems  for  graph  algorithms  exist 
(see,  for  example,  [7]),  but  require  tremendous  programming  effort  to  custom  tailor  for 
any  specific  application.  My  goal  is  something  simple  and  modest,  with  few  bells  and 
whistles  but  lots  of  flexibility.  In  particular,  I  need  to  be  able  to  add  features  as  the 
need  for  them  arises  (rather  than  working  around  complicated  features  of  an  existing 
system  to  meet  my  needs). 
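
The data model behind such an editor might look like the following minimal Python sketch. It is an illustration only: the class name EditableGraph and all of its method names are hypothetical, the X-Windows front end is omitted, and vertices are assumed to be mutually comparable identifiers such as integers.

```python
class EditableGraph:
    """In-memory model for a hypothetical mouse-driven graph editor."""

    def __init__(self, directed=False):
        self.directed = directed
        self.position = {}      # vertex -> (x, y) screen position
        self.vertex_label = {}  # vertex -> label (or None)
        self.edge_label = {}    # (u, v) -> label (or None)

    def _key(self, u, v):
        # Undirected edges are stored once, under a canonical ordering.
        return (u, v) if self.directed or u <= v else (v, u)

    def add_vertex(self, v, x=0, y=0, label=None):
        self.position[v] = (x, y)
        self.vertex_label[v] = label

    def move_vertex(self, v, x, y):
        self.position[v] = (x, y)

    def add_edge(self, u, v, label=None):
        if u not in self.position or v not in self.position:
            raise KeyError("both endpoints must already exist")
        self.edge_label[self._key(u, v)] = label

    def delete_edge(self, u, v):
        self.edge_label.pop(self._key(u, v), None)

    def delete_vertex(self, v):
        del self.position[v]
        del self.vertex_label[v]
        # Deleting a vertex also deletes every edge incident to it.
        self.edge_label = {e: lab for e, lab in self.edge_label.items()
                           if v not in e}

    def adjacency_list(self):
        adj = {v: [] for v in self.position}
        for (u, v) in self.edge_label:
            adj[u].append(v)
            if not self.directed and u != v:
                adj[v].append(u)
        return {v: sorted(ns) for v, ns in adj.items()}

    def adjacency_matrix(self):
        order = sorted(self.position)
        index = {v: i for i, v in enumerate(order)}
        matrix = [[0] * len(order) for _ in order]
        for (u, v) in self.edge_label:
            matrix[index[u]][index[v]] = 1
            if not self.directed:
                matrix[index[v]][index[u]] = 1
        return order, matrix
```

Keeping the model separate from the display code is one way to get the flexibility asked for above: a new feature becomes a new method on the model, and an algorithm implementation only ever sees the adjacency list or matrix.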

The  work  on  hypercube  embedding  has  already  benefited  from  having  a  reasonably 
powerful  CPU  available  for  exhaustive  searches  and  simulated  annealing  trials.  The 
SUN  workstation  purchased  with  funds  from  this  project  has  been  a  major  help  in  this 
regard. 

10. Societal Issues. I would like to begin this section by thanking the Office of
Naval  Research  for  their  support.  There  are  three  ways  in  which  this  ONR  grant  has 
made  a  major  difference  in  my  research  career.  First,  it  enabled  me  to  entice  one  of 
our better graduate students to stay on for a PhD, rather than quitting with a master’s
degree. I would like to emphasize that support of good students should be the primary
motivation for obtaining research money. The more research money can be funneled
directly to students, the better off we will all be in the long run. This suggests that it’s
better  to  support  many  smaller  projects  rather  than  fewer  big  ones.  Second,  the  ONR 
grant  has  enabled  me  to  purchase  a  Sun  workstation  for  computational  experiments, 
editing,  and  preparing  papers.  Though  initially  my  productivity  was  almost  halted 
while  I  figured  out  what  kind  of  workstation  would  meet  my  needs  and  how  to  use 
the  thing  once  I  got  it,  it  has  been  a  tremendous  help  in  the  more  recent  past.  A 
major problem at this university is lack of software support personnel - I waited several
months for someone to install key pieces of software (e.g. LaTeX and X-Windows) on
my station and finally had to do it myself. Staff, both secretarial and technical, should
also be a category of high priority; without staff, equipment does not get optimum use
and researchers spend too much of their time on routine tasks. Finally, the ONR grant
has been a "shot in the arm" to my self-esteem as a researcher. It’s sometimes hard
to  feel  theoretical  work  is  worth  anything  in  a  place  that  puts  great  emphasis  on  large 
practical  projects  with  specific  missions.  If  basic  research  is  to  survive  and  continue  to 
contribute  to  our  economic  health  (by  providing  practically  applicable  results  and  by 
training  the  next  generation  of  scientists),  funding  agencies  need  to  pay  more  attention 
to  the  quality  and  enduring  nature  of  the  research  being  performed  rather  than  its 
immediate applicability to practical problems.

Abstract

The goal of the adaptive database project at the University of Colorado is to develop
techniques which will make database systems useful for newer applications, such as
engineering design. In such cases, there is a need to support complex, computed data, as
well as a need to hand-tailor a database system to suit specific processing requirements, for
example, version support and document management. Two different experimental systems,
one addressing each of these concerns, are under construction. The algorithms and
techniques developed for these systems are intended to help relieve the advanced database
user from the highly constrained mechanisms which traditional database management
systems provide.


1.  Background 

Traditionally,  database  systems  were  used  by  business  programmers,  and  their 
needs were at least perceived to be rather simple. Data in the real world spans a wide
spectrum of complexity, from highly unstructured (like text) to highly structured (like
airplane designs). In data processing environments, data is typically represented only in
a narrow band of this spectrum. All data is seen as being tightly, yet simply, structured.
Further, most transactions against the database are submitted in batch mode. The goal is
merely to support the fast retrieval of large numbers of similar, simply-structured
records.  As  a  result,  conventional  database  systems  provide  very  little  in  the  way  of 
abstraction,  and  in  particular  cannot  effectively  represent  data  whose  internal  structure  is 
either  highly  structured  or  highly  unstructured. 


In recent years, a new generation of potential database users has emerged. This
includes software engineers, VLSI and printed circuit board designers, aircraft and CAD
engineers, as well as those involved in office automation. These individuals wish to store
and manipulate many forms of data, in particular, highly structured objects. (There is a
need to represent unstructured data, specifically text, as well, but this research project
does not address this issue.) Further, engineers often wish to manipulate data in an
interactive environment. In sum, newer database users have a need for all the amenities a
database system provides - such as concurrency, serializability, transaction management,
rollback, and recovery - but in an interactive design mode. Since traditional database systems
do not suit these needs, many researchers are examining the numerous problems
related to this grand challenge.

2.  Research  Objectives  and  Issues 

Clearly, the goal of providing database support for interactive design users is gigantic.
New data models, storage and access mechanisms, query languages, user interfaces,
and many other tools are needed. In this project we focus on two specific problems and
use a common philosophical approach in attacking each of them. Our first area of concentration
involves the support of computed data. In a design system, as opposed to a
data  processing  system,  there  is  a  vast  amount  of  tightly  interconnected  computed  data.  A 
design  for  an  airplane  includes  highly  interrelated  data;  changing  one  part  of  the  design  is 
likely  to  have  effects  on  many  other  aspects.  Further,  it  must  be  accessed  quickly,  as 
designers  work  in  real  time.  Our  second  focus  is  on  a  broader  issue,  that  of  allowing 
advanced  users  to  cleanly  integrate  into  one  database  environment  a  variety  of  complex 
tools. For example, in a software development system used by software engineers, the
database must interact with versioning, configuration, and report systems.

We are approaching each of these tasks from the perspective of adaptability. This
means that, unlike existing database systems, the DBMS is not rigid. In our first research
effort, we are focusing on the ability of the system to adapt itself at the physical level;
computed data is managed in a way that allows the DBMS to learn from past usage
experience and rearrange the way it processes updates. This is crucial in minimizing the
potentially exponential costs of calculating computed data. In the second effort, we
focus on the ability of the database user to adapt the system to suit his or her needs - at
the conceptual level. This is important, as engineering applications vary dramatically in
their requirements, and often require very specialized tools.

3.  Approaches  and  Progress 

The  two  projects  described  above  are  called  Cactis  and  A  La  Carte.  Cactis  has 
resulted  in  the  development  of  parallel  algorithms  for  the  maintenance  of  computed  (or 
derived)  data.  These  algorithms  are  based  on  attributed  graphs  and  dramatically  reduce 
the amount of I/O necessary to keep complex engineering database entities up to date. A
La Carte uses the approach of abstracting the database management system up another
level, resulting in the design of a database generator; such a system is, as a result,
designed to be much more tailorable. The main problem lies in doing this in a fashion
which  does  not  require  vast  amounts  of  low-level  programming. 

Both  of  these  projects  also  share  another  common  philosophic  approach,  besides 
one  of  adaptability.  They  both  attempt  to  integrate  two  directions  which  have  been 
prominent in the database research community - behavioral and structural (or "semantic")
object-oriented modeling. (Behavioral object-oriented modeling is often simply referred
to by the term object-oriented.) This has allowed the support of data objects which are
both structurally complex and dynamic. This is crucial in supporting emerging engineering
applications. Below, we discuss both projects, first Cactis, then A La Carte.

3.1.  Cactis 

Consider an engineering design application familiar to all of us: software development
and reuse. In every phase of the software life-cycle, we see a need for derived data.
Examples include the following data relationships: the dependency between a source
module and the corresponding object module; the derivation of a load module from a
number of object modules; and the relationships between a set of software modules and
the associated documentation, requirements, bug reports, fix reports, and project milestones.
In each case, if one piece of data changes, others are likely to be changed as a
direct  consequence. 

With  traditional  database  systems,  this  sort  of  derived  data  must  be  maintained  by 
the  application  software  or  directly  by  end  users  -  typically  with  a  mechanism  known  as 
triggers. This introduces problems. Programmers are not likely to write code that is portable
from one software environment to another. Also, if computed data is maintained
directly by the DBMS, then it may be managed in a much more efficient and correct
fashion. Cactis [7,9] is designed to support computed data in a highly efficient manner,
and to do so in a consistent fashion. Triggers, on the other hand, must be hand-coded by
the user and are difficult to reuse. Even more significantly, as a trigger mechanism is
likely to operate on a first come, first served basis, no attempt is made to optimize trigger
execution. In general, if several trigger sequences all lead to the same piece of computed
data, it could be updated an exponential number of times, with respect to the number of
trigger paths to the data item. A prototype Cactis system has been implemented, in order
to provide a basis for the experimentation with and evolution of the underlying algorithms.
In particular, substantial experiments have been performed in order to illustrate
that the techniques developed are useful for engineering databases. The research is being
conducted in conjunction with Scott Hudson of the University of Arizona.

Cactis  represents  a  database  as  an  attributed  graph,  and  uses  an  incremental  graph 
update  algorithm.  It  also  is  self-adaptive,  in  that  it  learns  from  past  experience  and 
adjusts both process scheduling and data clustering on disk to minimize the I/O cost of
maintaining computed data. We have run extensive performance tests on Cactis, illustrating
substantial savings when the system is used. The potentially exponential behavior
of triggers has been reduced to linear cost.
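
The gap between trigger-style propagation and a single ordered pass over the dependency graph can be illustrated with a small Python sketch. This is not the Cactis algorithm, whose details are not given here; it is a generic topological-order update under assumed names (diamond_chain, trigger_update, incremental_update), with an arbitrary rule that each derived value is the sum of its sources.

```python
from collections import defaultdict

def diamond_chain(k):
    """Build k stacked diamonds: n_i feeds a_i and b_i, which both feed n_{i+1}.
    Returns deps, mapping each node to the nodes it is computed from."""
    deps = {"n0": []}
    for i in range(k):
        deps[f"a{i}"] = [f"n{i}"]
        deps[f"b{i}"] = [f"n{i}"]
        deps[f"n{i + 1}"] = [f"a{i}", f"b{i}"]
    return deps

def dependents_of(deps):
    out = defaultdict(list)
    for node, sources in deps.items():
        for s in sources:
            out[s].append(node)
    return out

def recompute(node, deps, value):
    # Illustrative rule only: a derived value is the sum of its sources.
    value[node] = sum(value[s] for s in deps[node])

def trigger_update(deps, value, changed):
    """Naive triggers: each change immediately fires every dependent."""
    out = dependents_of(deps)
    count = 0

    def fire(node):
        nonlocal count
        for d in out[node]:
            recompute(d, deps, value)
            count += 1
            fire(d)

    fire(changed)
    return count

def incremental_update(deps, value, changed):
    """Mark every node reachable from the change, then evaluate each
    marked node exactly once, in dependency (topological) order."""
    out = dependents_of(deps)
    affected, stack = set(), [changed]
    while stack:
        n = stack.pop()
        for d in out[n]:
            if d not in affected:
                affected.add(d)
                stack.append(d)
    # Number of not-yet-evaluated affected sources for each affected node.
    pending = {n: sum(1 for s in deps[n] if s in affected) for n in affected}
    ready = [n for n, c in pending.items() if c == 0]
    count = 0
    while ready:
        n = ready.pop()
        recompute(n, deps, value)
        count += 1
        for d in out[n]:
            pending[d] -= 1
            if pending[d] == 0:
                ready.append(d)
    return count
```

On a chain of ten diamonds (thirty derived nodes), both functions arrive at the same final values, but the trigger version performs 4092 recomputations while the ordered pass performs 30, one per affected node.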

Also, several components of a software environment, including a "Make" [5] facility,
a critical path tool, and a bug report system, have been built on top of Cactis. Further,
the Arcadia software environment project [6, 10, 12] has made some use of Cactis.

Cacti  [8]  is  a  distributed  version  of  Cactis,  and  is  currently  under  construction.  It  is 
targeted  for  a  local  network  of  Sun  workstations,  and  is  motivated  by  the  fact  that 
software design teams often work in distributed, interactive environments. The implementation
of the system is being greatly facilitated by the fact that the graph algorithm in
Cactis is naturally parallel, thus making it easy to adapt it to a distributed environment.
In keeping with the self-adaptive nature of Cactis, the new system uses usage statistics to
replicate, migrate, and recluster data around the network.

3.2. A La Carte


A  La  Carte  [2]  is  in  its  early  stages,  and  addresses  much  higher-level  issues  than 
Cactis or Cacti. The project, which is being conducted in conjunction with Colorado
PhD students Pam Drew and Jonathan Bein, was motivated by the lesson that Cactis is
still a very low level tool, and that many problems arise when trying to integrate various
software environment tools within a Cactis application. Again, a prototype is under
development, so that real experiments can be performed to validate and evolve the techniques
under design.

The  system  uses  mixins  and  multiple  inheritance  to  allow  an  engineer  to  select  both 
database facilities and software environment capabilities. For example, the designer of a
software environment may choose an appropriate concurrency control option and clustering
mechanism, as well as a version facility, a document management mechanism, and a
configuration tool. A La Carte puts them all together in one system, using a method
integration technique. It thus is very similar in spirit to Exodus [3] and Genesis [1]; a
significant difference is that A La Carte is a less aggressive project, and is oriented
mostly toward examining the appropriate mechanisms for resolving conflicts when mixing
in complex software methods.
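
As a rough illustration of selecting facilities by mixing them in, the following Python sketch composes a store from a locking mixin and a versioning mixin. All class names here are hypothetical, and Python's built-in method resolution order stands in for whatever conflict-resolution mechanism A La Carte actually uses, which is not described in this report.

```python
class Store:
    """Base object store; mixins extend write() cooperatively."""

    def __init__(self):
        self.data = {}
        self.log = []   # records the order in which facilities run

    def write(self, key, value):
        self.data[key] = value
        self.log.append(f"store {key}")


class LockingMixin:
    """Stand-in for a concurrency control facility."""

    def write(self, key, value):
        self.log.append(f"lock {key}")
        super().write(key, value)
        self.log.append(f"unlock {key}")


class VersioningMixin:
    """Stand-in for a version facility: keeps overwritten values."""

    def __init__(self):
        super().__init__()
        self.history = {}

    def write(self, key, value):
        if key in self.data:
            self.history.setdefault(key, []).append(self.data[key])
        self.log.append(f"version {key}")
        super().write(key, value)


class DesignStore(LockingMixin, VersioningMixin, Store):
    """The engineer's selection: the base-class order fixes how the
    competing write() methods are combined."""
    pass
```

Both mixins define write(), which is exactly the kind of conflict that arises when mixing in methods. In this sketch the order of the base classes resolves it: locking wraps versioning, which wraps the plain store. Listing the mixins in the other order would produce a different but equally well-defined system.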

4.  Research  Directions 

There  are  many,  many  other  aspects  of  database  support  for  newer  complex  data 
that  must  be  investigated.  Of  prime  importance  is  the  representation,  in  a  coordinated 
fashion,  of  different  levels  of  structured  data,  all  the  way  from  text  to  video  to  sound  to 
graphical images to layout diagrams. A few researchers are working on multi-media
database projects oriented toward solving this problem [4]. In particular, many engineering
applications have very demanding data modeling requirements. Consider PCB
boards; the job of representing the wiring problem is immense, and current wiring
software represents the board as an unstructured file, with all the semantics of the board
embedded in an ad hoc fashion in the application software. We hope that Cactis and
Cacti will provide some help in maintaining the relationships between various forms of
data. As another example, if a CAD image changes, the documentation describing it is
likely to change as well. A La Carte should also provide some insight, in terms of providing
a mechanism for integrating the wide variety of tools needed to support multiple
forms of data.

Engineers also want to manipulate computed data in real-time, using interactive,
staged transactions. Traditionally, DBMSs have been tailored toward the support of
batched transactions. And, in the future, when a design error occurs, it will not be
sufficient to blindly roll back the entire transaction (which may have taken days or
weeks). For example, if a piece of source code is changed, this might automatically
affect documentation, milestones, bug reports, test data, and test executions. If one of
these executions abnormally terminates, the software designer does not want the entire
transaction - including the source code changes - backed out. Very fine-grained control
of how the transaction is indeed backed out is needed. Some aspects of staged recovery
can be viewed as layered forms of derived data; Cactis and Cacti might be of help in
dealing with this. A La Carte might be useful in assisting the database user in choosing
and integrating specific forms of recovery.


The design engineer will also want less restrictive forms of concurrency control. If
an item is currently in use, and another engineer wishes to use it, he would rather
electronically tap the current user on the shoulder and ask for a copy of the item. Current
concurrency control algorithms will make him wait an arbitrary amount of time. This is
due to the assumption that database transactions are done in batch mode. A related issue
concerns versions (a form of derived data, a sort of which is supported by Cactis and
Cacti). Will versioning replace concurrency control in interactive systems? Perhaps
rather than locking an item, a user will merely version it. A user might check out a copy
and check it back in to create a new "current" version, in much the same way as
engineers now check in and check out design documents. This of course brings up the
age-old problem of version integration. This is related to the problem of reversing the
process  of  creating  derived  data.  If  you  change  a  piece  of  derived  data  (a  version)  how 
does  it  affect  the  original? 

Also, as structural encapsulation (semantic modeling) and behavioral encapsulation
(object-oriented modeling) become more popular, a challenging question will arise. How
will the many forms of behavioral encapsulation be integrated? Or better yet, should
they?  Cactis  and  Cacti  use  derived  data  as  their  behavioral  techniques.  A  La  Carte  uses 
methods.  They  are  very  similar,  but  not  identical.  There  is  no  message  passing  paradigm 
in  Cactis  and  Cacti,  and  a  method  does  not  normally  store  the  result  of  its  actions,  unless 
the  user  codes  it  that  way.  The  problem  does  not  stop  there.  How  do  methods  and 
derived  data  relate  to  rule  systems  and  constraint  languages?  Will  we  want  a  DBMS  to 
support  all  four  of  these  techniques  or  just  one?  And  if  we  want  to  mix  them,  what 
theoretical  and  algorithmic  work  needs  to  be  done?  A  related  issue  is  that  of  integrating 
more complex forms of manipulations (such as methods, rules, constraints, and derived
data) with traditional set-oriented and aggregate operations; after all we do not want to
lose the capability that relational systems are so good at.

One  interesting  question  is  how  all  this  new  technology  could  be  used  by  business 
environments. I believe that data processing systems do indeed represent complex
objects, versions, methods, rules, and derived data - they are just accustomed to embedding
them in application software and have never had more powerful tools available.

Probably one avenue that (in my opinion) should not be pursued is the perverting of
current relational paradigms to solve these problems. The various attempts by some
researchers to make relational systems look like semantic systems were very unsuccessful.
This is due to the inherent lack of abstraction in the relational model. The key problem
is that the relational model has no concept of an object - all database items are made
up of identifiers and cannot be recursively combined into complex objects. I believe that
current attempts at taking the relational model and making it look (behaviorally) object-oriented
will fail for the same reason. No matter what, the user will be conscious of
manipulating  identifiers  -  not  objects. 


5.  A  Grand  Challenge 

A  few  months  ago,  a  group  of  about  forty  researchers  met  for  two  days  in  the  Napa 
valley  [13].  All  participants  were  active  researchers  in  either  the  software  engineering  or 
database  realm.  There  were  no  scheduled  presentations.  The  goal  was  to  determine,  as  a 
working group, the research tasks which needed to be performed in order to provide
effective database support for software environments. (A software environment was generally
accepted as a software system which assists program developers in the design, coding,
debugging, deployment, maintenance, and eventual reuse of software systems.)
Many different ideas were discussed - but a surprising result came out of the workshop.
It  was  generally  felt  that  the  biggest  challenge  was  the  integration  of  all  of  the  many 
research  results  that  are  currently  being  published  in  the  software  environment  database 
area. 

Indeed,  the  central  problem  seems  to  be  that  there  is  no  underlying,  integrated 
model or representation for software database support. Many authors feel that "object-oriented"
databases are the answer - but no one could agree on the definition of the term.
And, unlike other, more mathematically tractable models such as the relational model,
there is no clear way of representing the implementation of object-oriented databases.
The same is true for object-oriented query specification and optimization. My feeling is
that object-oriented databases are not at all identical to engineering databases; it is
merely  true  that  the  object-oriented  paradigm  is  a  promising  platform  for  studying 
engineering  issues. 

Furthermore,  while  many  researchers  are  working  on  new  ways  of  managing  large 
and complex objects, of executing long and interactive and staged transactions, of interfacing
graphically with a complex database, and of implementing novel forms of concurrency
control - it is not clear what challenges lie in putting all these things in one system.
This would require a very large scale research platform. And, it is clear that these
various new software mechanisms will interact in as yet unknown ways. The big question
is: Will a uniform, understandable, and implementable theory of object-oriented
database design and implementation arise? Indeed, our hope is that Cactis, Cacti, and A
La Carte are a step in this direction.

6. Research Transitions


My  own  goal  is  to  link  up  as  much  as  possible  with  existing  engineering  projects,  to 
see  if  the  results  of  my  work  are  usable.  The  DARPA  sponsored  project  Arcadia  [11]  is 
currently my main target project. Already, I have learned a few things. For example, the
algorithms used in Cactis to schedule derived computations are viewed as too restrictive
by many engineers. They would like more control over how the system makes decisions.
This can benefit the system by providing crucial information that may be very hard to
deduce automatically. An example is that a user may know that he is about to suddenly
shift  his  area  of  focus,  and  if  he  is  able  to  warn  the  database,  Cactis  can  adapt  more 
quickly. 

7. Technological Impacts

The biggest technological area that will impact this work is parallel and distributed
computing. The ready availability of high-speed networks and workstations, the growth
of long-haul networks, the development of many-processor machines, the introduction of
parallel  channels,  and  the  development  of  disk  arrays  will  all  affect  database  technology. 
This  is  why  the  algorithms  in  Cactis  were  designed  to  be  naturally  parallel. 

8. Societal Issues

It  goes  without  saying  that  ONR  support  has  dramatically  affected  my  career.  Early 
and  intensive  financial  support  enabled  me  to  get  my  research  program  off  the  ground 
and  quickly  produce  solid  results.  Further,  two  of  my  PhD  students  are  now  professors 
(Scott Hudson of the University of Arizona and Nabil Kamel of Michigan State), and I
am currently working with ten other PhD students. Of course, I am sure that not all of
them will stay with me, but students are naturally attracted to large, active research programs.
In sum, I am a very strong supporter of the YIP program.

The only other societal issue I might point out is that the grand challenge mentioned
above will necessitate a much more coordinated and cooperative effort by American
researchers. If we were able to set research goals at a national level (as the Japanese do),
and corporations would not always operate in a strictly product-based manner, much
more  progress  could  be  made. 

9.  Recommendations  to  Funding  Agencies 

My  main  recommendation  is  that  federal  funding  agencies  should  not  try  to  set 
specific  research  goals  and  support  only  projects  related  to  these  goals,  unless  they  are 
also  prepared  to  help  researchers  coordinate  their  efforts.  Doing  the  first  without  doing 
the second only produces many fragmented projects which do not build on each other. I
think that funding larger, multi-institution projects is a good idea.

TRANSFORMATIONAL PROGRAMMING SYSTEMS WITH LARGE-SCALE AUTOMATION

Robert Paige

1) Abstract and Background

We regard the discovery of useful program transformations as
part of a natural evolutionary process that begins with the
discovery of informal principles of software engineering and
leads to their formalization and mechanization within a
compiler. For this process to succeed we believe that three
essential methodologies must be developed - for programming,
transformations, and compilers. Programming methodology is
the informal but essential principles that facilitate the
manual construction of programs and the synthesis of
algorithms. Transformational methodology is the more formal
process of syntactic analysis and symbolic manipulation by
which programs can be improved automatically or
semiautomatically. Compiler methodology is a fully
automatic form of transformational methodology used to
implement a programming language.

There is a natural process whereby programming methodology
matures and devolves into transformational methodology whose
perfection leads to compiler methodology. This process is
behind the evolution of low level machine languages into
high level languages. This process involves the recognition
that major common patterns of programming style in a low
level language can result from the application of standard
techniques of program improvement to higher level
programming prototypes. When such a technique can be
formalized as an implemented 'meaning-preserving' program
transformation, then we can conveniently write programs at a
higher level of abstraction without a penalty in
performance, since the efficient lower level versions can be
derived mechanically or semiautomatically by transformation.
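
A small Python sketch may make the idea concrete. It shows a high level prototype and a lower level version that computes the same result by maintaining a running value incrementally. The choice of example (prefix sums) and the function names are assumptions made for illustration; here the derivation is done by hand, whereas the text envisions it being carried out mechanically.

```python
def prefix_sums_spec(xs):
    """High level prototype: restates the definition directly.
    Each prefix is summed from scratch, so the time is quadratic."""
    return [sum(xs[: i + 1]) for i in range(len(xs))]


def prefix_sums_derived(xs):
    """Lower level version: the repeated sum is replaced by a running
    total updated in constant time per element, so the time is linear."""
    out, running = [], 0
    for x in xs:
        running += x          # invariant: running == sum(xs[:i + 1])
        out.append(running)
    return out
```

The loop invariant in the comment is the meaning-preserving part of the step: it states that the maintained variable always equals the expression it replaced, which is what lets the two versions be used interchangeably.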

Such transformations form the essential lemmas in a proof of
correctness of the implementation-level code. When these
transformations can also be associated with a precise
measure of improvement in time and space, then the lemmas
can facilitate a proof of performance as well as
functionality of the low level code. This approach to
verification leads naturally to the intriguing idea of
problem specification languages that fall within specific
complexity classes; i.e., that can be compiled automatically
into implementations whose time and space bounds are
guaranteed at language design time. In this approach the
assumed correctness of a specification together with a
correctness proof of the 'supercompiler' proves the
functionality and the performance of the compiled code.

2)  High  Level  Research  Objectives 

a.  to  automate  major  aspects  of  programming 


b. to develop a new means of software production that
enables us to implement algorithms or construct software
that would be too complex to program with existing
technology

c.  to  unify  problem  specification,  program  design, 
verification,  and  analysis  within  a  single  unified  framework 

d.  to  make  it  easier  to  teach  and  understand  algorithms  and 
software  engineering 

3)  Research  Issues 
(see  Appendix  I) 

4) Technical Approach

a. The recent focus of our project is in developing a new
paradigm of 'program verification by compilation'. Within
this paradigm programs are verified for their functionality
and performance. This is achieved by defining a problem
specification language L and a compiler for L that will
always generate code with guaranteed time and space
complexity. Thus, a single theorem stating the correctness
of the compiler for L is sufficient to guarantee the
functionality and performance of any program compiled from
an L specification.

b. Program transformations such as fixed point iteration,
finite differencing, stream processing, and real-time set
machine simulation on a RAM have been developed within our
project and comprise the major phases of the compiler
mentioned in part (a) above.

c. Underlying any programming methodology is the theory of
algorithms. Algorithm implementations are also the most
difficult kinds of programs. Thus, we are benefitting
greatly by studying algorithms and their derivation by
transformation in order to develop improved programming and
transformational methodologies.

d. We are making use of real-time simulation of an abstract
set machine on a RAM for data structure selection. This
approach seems to be new and promising.

e. Rather than use attribute grammars for semantic
analysis, we have chosen a pattern directed approach similar
to MENTOR/TYPOL.

5)  Progress 

a. With Cai we have developed a functional problem
specification language called SQ+ (SETL expressions plus
fixed points) that can express all partial recursive
functions. We show how to compute fixed points efficiently
for abstract SQ+ functions over abstract lattice theoretic
data types.

b. In POPL87 Cai and I reported on a subset of SQ+ called
L1 that can always be compiled into programs that run on a
RAM in asymptotically linear time and space with respect to
the problem input/output space. We showed that a significant
fragment of an optimizing compiler could be specified in L1.
Recently, Cai has shown that the very difficult problem of
Planarity Testing can also be specified in L1.

c. In our algorithm research with Tarjan we have
focused on partition refinement as an algorithmic strategy
and have discovered improved algorithms for the Single
Function Coarsest Partition Problem, Lexicographic Sorting,
the Relational Coarsest Partition Problem, and Double
Lexical Ordering.

d. Cai and I have generalized Hoffman and O'Donnell's fast
top-down tree pattern matching algorithm to be incremental
with respect to patterns and to handle more general patterns
with several pattern variables and implicit equality
testing. Cai used this algorithm to design and implement an
inductive definition language to be used for semantic
analysis in RAPTS. Tarjan and I recently obtained some
space time tradeoffs for a potentially useful subclass of
these patterns.

e. We recently developed a methodology for implementing sets
and maps on a conventional RAM based on real-time
simulation. We intend to use this work as the basis for the
third phase of the L1 compiler, which should be operational
by the end of the summer.
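
The partition refinement strategy named in item (c) can be
illustrated with a small sketch. It is not the improved
algorithm developed with Tarjan; it is a naive refinement loop,
written in Python rather than SETL, and the names
coarsest_partition and blocks_of are assumptions introduced
only for this illustration. Given a function f on a finite set
and an initial partition, it repeatedly splits blocks until
any two elements in the same block have images in the same
block.

```python
def coarsest_partition(elements, f, initial_block):
    """Naive single-function coarsest partition (illustrative only).

    elements      : re-iterable collection of hashable items
    f             : dict mapping each element to its image
    initial_block : dict mapping each element to an initial block label
    Returns a dict element -> block label of the coarsest stable refinement.
    """
    block = dict(initial_block)
    while True:
        # Pair each element's block with the block of its image under f.
        signature = {x: (block[x], block[f[x]]) for x in elements}
        # Renumber the distinct signatures; this yields a refinement.
        labels = {}
        refined = {x: labels.setdefault(signature[x], len(labels))
                   for x in elements}
        # A refinement with the same number of blocks is the same partition.
        if len(labels) == len(set(block.values())):
            return refined
        block = refined


def blocks_of(labelling):
    """Convert an element -> label dict into a set of frozensets."""
    groups = {}
    for x, label in labelling.items():
        groups.setdefault(label, set()).add(x)
    return {frozenset(g) for g in groups.values()}


# A chain 0 -> 1 -> 2 -> 3 -> 3 where only element 3 starts in its own block.
chain = {0: 1, 1: 2, 2: 3, 3: 3}
start = {0: "B", 1: "B", 2: "B", 3: "A"}
result = blocks_of(coarsest_partition(range(4), chain, start))
```

Each round costs time linear in the number of elements and there
can be linearly many rounds, so this sketch is quadratic in the
worst case; it only shows what is being computed, not how to
compute it quickly.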

6)  Research  Directions 

a)  Can  our  transformational  methodology  be  applied  to 
machine  models  other  than  the  sequential  RAM?  What  about 
for  parallel  RAM's? 

b) Can we see a great improvement in reliability,
performance, and labor costs in constructing software using
our methodology? In particular, can we construct the major
components of a high performance optimizing compiler in a
relatively brief time period by transforming a mathematical
specification?

c) Can our transformations be usefully generalized and
implemented with faster and better algorithms? Tree pattern
matching seems to be a fundamental operation within our
methodology. Can our matching algorithms be improved
further? Are there better heuristic or approximation
algorithms to solve those problems that are NP-hard; e.g.,
stream processing?

d)  Can  our  techniques  be  further  applied  to  database 
optimization  and  integrity  control? 

e) Can we generalize our results with L1 to develop a whole
hierarchy of problem specification languages for
complexities at each polynomial degree? Is there a useful
subset of Prolog equivalent to L1? Can we define a
syntactic class of attribute grammars (even circular
grammars) that fall within L1?


7) One grand practical challenge would be to be able to
implement a high performance (comparable to compile- and
run-times for IBM's best compiler) optimizing FORTRAN
compiler for an IBM 370 by generating it from a succinct
L1 specification. An ambitious theoretical challenge would
be to produce specification languages with worst case
complexity bound to each polynomial degree and whose union
has the same power as Gurevich and Shelah's polytime IO
language.

8) i) Our work has had an impact mostly on the
transformational programming community, but also on the
database, programming language, algorithm, and complexity
communities.

We had a major impact on a European ESPRIT project in rapid
prototyping called SED with participants from Thomson-CSF
and INRIA in Paris, Enidata in Rome, U. of Patras in Greece,
and Hildesheim U. in W. Germany.

IFIP WG2.1, the original ALGOL working group, is currently
developing a new mathematical problem specification and
programming language within a transformational programming
environment. The group has been influenced by our work and
has asked me to head a subgroup on program transformations
including finite differencing.

The Dutch government has begun a national project on
transformational programming. I was invited to the
Netherlands to present the results of our project.
Representatives from the Prospectra Project, project CIP,
Kestrel, ISI, and other groups seem to have been influenced
by our work in finite differencing and want to utilize our
current work in fixed point iteration and data structure
selection by real-time simulation.

A group of researchers including Neil Jones from DIKU
working with the transformational paradigm of mixed
computation feel some impact of our work and, consequently,
I am an invited speaker at the ESOP conference in Copenhagen
in May 1990.

The database community (e.g., N. Roussopoulos) seems to have
been influenced by our earlier work in integrity control.

The language theory community seems to be interested in the
algorithm work with Tarjan on the relational coarsest
partition problem.

We inspired Gurevich and Shelah to investigate function
classes computable with respect to input and output space.

ii) Before anyone else uses the results of our technology, it
would be most convincing if we would be able to use it
ourselves. This is only now starting to happen. By the end
of the summer, when we expect our L1 compiler to generate
linear time C code, we will be in a better position to
assess how the technology can best be applied and by whom.
Our hope is that we can consider implementing code for the
forthcoming i80860 Intel chip, a 1 million gate RISC
mini-Cray I.

9) The RAPTS project would benefit from a few SUN 3/60's and
half  of  a  super  eagle  disk.  SUN  3/60's  could  overcome  some 
of  the  speed  problems  we  have  with  SETL  as  our  main 
implementation  language.  It  would  also  make  it  easier  to 
run  the  prototyping  systems  MENTOR  and  the  SED  environment, 
which  are  relevant  to  our  work.  However,  space  is  a  more 
serious  problem  for  us,  because  of  SETL  and  also  our 
reliance  on  large  tables  to  facilitate  fast  pattern  matching 
to  implement  transformations. 

10) This year there are too many qualified faculty applicants
for  too  few  positions  in  the  U.S.  At  NYU,  where  we  received 
over  200  applications,  we  have  been  reluctant  to  send  out 
rejection  letters.  If  this  situation  continues,  I  believe 
that  it  would  be  humane  and  also  scientifically  sound  for 
funding  agencies  to  allocate  more  funding  for  post-doc 
positions  and  less  for  students. 

11)  Recommendations  on  Funding 

Transformational  programming  and  parallel  computation  are 
two  emerging  fields  that  may  ultimately  depend  on  each  other 
for  success.  Perhaps,  because  ad  hoc  programming  on 
sequential  machines  is  so  straightforward,  sequential 
programming  methodology  has  had  little  impact  outside  the 
academic  community,  and  transformational  methodology  has  had 
little  impact  at  all.  However,  because  ad  hoc  programming 
for  parallel  machines  is  so  hard,  and  because  progress  in 
software  construction  has  lagged  behind  architectural 
advances  for  such  machines,  there  is  a  much  greater  need  to 
develop  parallel  programming  and  transformational 
methodologies. ONR should stimulate research on formal ways
to overcome problems of parallel computation with respect
to both software development and algorithm design.

Specific problems include derivations of synchronous and
asynchronous parallel algorithms, systems, and
architectures. These derivations may include sequential to
parallel machine translation, retiming and reconfiguration
techniques, and techniques for simulation of one kind of
parallel machine in another. Also of interest is the
incorporation of specification, design, verification, and
analysis into parallel programming, transformational, and
compiler methodologies. Applications of transformational and
derivation techniques are of special interest as well, both
for explication of difficult parallel algorithms (e.g.,
parallel graph algorithms) and to solve real-world problems
(e.g., derivations of communication protocols and database
query optimization for distributed computation).

Appendix  I.  Research  Issues 

Hypothesis: The theory of algorithms underlies the science
of programming. A viable theory of problem specification and
program transformation would provide the basis for a theory
of algorithm and program design.

Main Objective: The discovery of basic program
transformations that capture principles of algorithm design;
implementation and application of these transformations
within the RAPTS system.

Motivation for Research:

Four general difficulties with current program construction
methodologies (originating with Knuth[68] and Aho, Hopcroft,
and Ullman[74]) together with long range goals of our
project are listed below.

1.  (Problem  Specification) 

Problems  are  specified  in  an  ad  hoc  way.  But  informal 
problem  specifications  can  be  confusing,  even  ambiguous. 

This is bad for documentation and complicates the synthesis
and analysis of correct programs.

Our  goal  is  to  provide  a  formal  mathematically  based  problem 
specification  language. 

2.  (Program  Synthesis) 

Programs  are  constructed  informally. 

Our goal is to directly map problem specifications into
efficient implementations by applying correctness preserving
transformations.

3.  (Program  Correctness) 

Program proofs are tied closely to implementation level
code. Consequently, they are lengthy, complicated, and
unconvincing.

Our goal is to guarantee the correctness of an
implementation by proving the correctness of the problem
specification and the transformations used to derive the
implementation.

4.  (Program  Analysis) 

Time and space complexity of implementations depend on a
rigorous analysis of the low level features of the
implementation code, independent of any design principles.

Our  goal  is  to  integrate  performance  analysis  with  the 
synthesis  process. 


Program construction using a supercompiler:

      Abstract Problem Specification   (supplied by user -
                 //\\                   proved correct with aid of system)
                //  \\
               //    \\  Transformations  (applied and justified
              //      \\                   by system whenever possible)
             //        \\
             V          V
  Implementation    Performance        (Output by system)
                    Specification


Main Sources of transformational programming methodology:

Topdown Stepwise Refinement: Dijkstra and Wirth[late 60's]
General Idea for a system: Cheatham[72]
Correctness: Floyd[67], Hoare[69]
Transformational Correctness: Gerhart[75]
Mechanical Performance Analysis: Ramshaw[79], J. Cohen[82]
Performance Analysis by Transformation: Millard[83]
Specification Language:
  VERS2-Earley[74], SETL-Schwartz[77], LCF/ML-Gordon, Milner,
  Wadsworth[79], Algorithmic Language-Bauer[82]
Transformations:
  Recursion to Iteration-Walker, Strong[73]
  Dynamic Programming-Bird[80], N. Cohen[83]
  Stream Processing-Allen, Cocke[71], Morgenstern[76], Burstall
  and Darlington[77], Reif, Scherlis[82]
  Finite Differencing-Briggs[16th century], Cocke, Schwartz[69],
  Cocke, Kennedy[77], Earley[76], Fong, Ullman[76], Fong[77,79]


Related work in formal program development methodology

1. ad hoc program construction and formal verification and
proof checking

  Manna
  Luckham
  etc.

  see Lipton, DeMillo, Perlis for criticism

2. synthesis

  constructive proof
    Manna and Waldinger
    Goad
    Bibel
    Bates

  Much manual intervention - efficiency is not considered

  Equational Approach
    Guttag
    Huet
    Hoffman and O'Donnell

  Efficiency is of even less concern - but mechanization is a
  plus

3. Transformational

  Bauer
  Cheatham
  Burstall and Darlington

  Much manual intervention, vast transformational libraries,
  long aimless derivation sequences, unpredictable capacity
  for improvement


Case Study

Three basic program transformations of wide applicability
have been developed and implemented within RAPTS. Development
of a fourth basic transformation is in progress. They are
illustrated below.

1. Solving Roots of Set Theoretic Predicates - The Genesis
of Algorithmic Strategy

Restrictions

Determinate problems
Executable  Specs 
Finite  Sets 

Determinate problems such as these are reminiscent of Linear
Programming, but the solution method we devise is similar to
the iterative techniques used to find approximate solutions
to numerical equations.


find  the  unique  solution  S  by  the  following 
iterative  procedure: 


The  solution  method  above  is  based  on  Tarski.  When  it  can 
be  applied,  the  solution  to  the  original  problem  is 
guaranteed  to  run  in  polynomial  time. 

Our transformation is described in Paige[84,84].
Generalization of this transformation to finding solutions
in partition spaces played a central role in the explanation
of a new improved algorithm to solve the single function
coarsest partition problem (Paige, Tarjan[84]).
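
Since the iterative procedure itself did not survive in this
copy of the report, the following is only a minimal sketch of
the general idea, assuming a monotone set-valued function over
a finite universe. It is written in Python rather than SETL,
and the names least_fixed_point, step, edges and w are
hypothetical. Starting from the empty set, it adds one missing
element at a time until f(S) is contained in S.

```python
def least_fixed_point(f, bottom=frozenset()):
    """Iterate from `bottom` until f(s) is a subset of s.

    Assumes f is monotone and that only finitely many elements can
    ever be added, so the loop terminates at the least such set.
    """
    s = set(bottom)
    while True:
        missing = f(s) - s
        if not missing:
            return s
        # Add a single element per step, in the spirit of "s with := x".
        s.add(next(iter(missing)))


# Hypothetical instance: the least s with w subset s and e[s] subset s.
edges = {1: {2}, 2: {3}, 4: {5}}
w = {1}


def step(s):
    """Return w together with the image of s under the edge relation."""
    return w | {y for v in s for y in edges.get(v, ())}


closure = least_fixed_point(step)
```

Recomputing f(s) from scratch on every step is what makes this
naive version expensive; the finite differencing transformation
described next is aimed at exactly that recomputation.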

2.  Finite  Differencing  —  The  Efficient  Implementation  of 
Strategy 

Further improvement to the attribute closure procedure
generated by the previous transformation can be achieved by
automatic application of finite differencing
transformations, so-called because they derive from Briggs's
16th century method of polynomial tabulation using
difference polynomials. But instead of tabulating
polynomials, we want to tabulate expressions of various data
types. For this example, instead of computing the
expression ***** in the naive way each time through the
loop, we will tabulate the value of this expression for each
successive value of S in an inexpensive incremental way. Our
technique avoids repeated calculation of ****** by

i. establishing the following four invariants on entry to
the loop:


The system establishes the invariants using a loop combining
transformation called stream processing (see Paige,
Koenig[82], Goldberg, Paige[84]).

ii. maintaining these invariants just after they are spoiled
by the modification to S within the loop. The maintenance
code is called 'difference code' and is generated by RAPTS
according to a chain rule.

iv. replacing the computation ****, made redundant within
the loop, by the variable new.

Based on structural properties of these eight invariants,
RAPTS automatically determines that the cumulative cost of
establishing and maintaining them is O(***) with respect to
a set theoretic complexity measure. It can then be determined
that the whole procedure runs in O(***) steps.


Our work in Finite Differencing goes back to
Paige, Schwartz[77] and includes Paige[79,83,84,84],
Koenig, Paige[81], Paige, Koenig[82].
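
The polynomial tabulation method that gives finite differencing
its name can be sketched in a few lines. This is an
illustration in Python, not RAPTS output, and the function name
is an assumption. To tabulate x**3 for successive integers, the
loop keeps three auxiliary variables holding the first, second
and third forward differences; establishing them before the
loop and updating them after each step replaces every
multiplication with additions.

```python
def tabulate_cubes_differenced(n):
    """Return [0**3, 1**3, ..., (n-1)**3] using only additions.

    Invariants on entry to each iteration, for the current x:
      value == x**3
      d1    == (x+1)**3 - x**3          (first difference)
      d2    == second forward difference at x
      d3    == 6                        (third difference, constant)
    """
    value, d1, d2, d3 = 0, 1, 6, 6     # the invariants established for x = 0
    table = []
    for _ in range(n):
        table.append(value)
        # 'Difference code': restore each invariant after x advances by 1.
        value += d1
        d1 += d2
        d2 += d3
    return table


cubes = tabulate_cubes_differenced(6)
```

The same pattern, an invariant established once and then
repaired cheaply after each modification, is what the
difference code described above does for set-valued
expressions.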


APPENDIX  II.  RAPTS  EXAMPLES 

Below are examples of code generated automatically by the
prototype L1 compiler from the given specifications using
RAPTS. The first and easiest derivation is of the graph
reachability problem. Next is the derivation of a live
statement analysis algorithm. The third example shows how
the fast Bernstein and Beeri attribute closure algorithm can
be compiled from a simple problem specification. The final
example is a simple constant propagation. Throughout this
session, with only one exception, manual intervention is
required only to supply the names of variables introduced by
transformations. The one exception is in the constant
propagation example, where the property of monotonicity could
not be deduced within the current implementation (a
practical rather than a theoretical shortcoming).

EXAMPLES 

1. Graph Reachability
program

  program tclose ;
    read ( e , w ) ;
    print ( the s: w subset s | e[s] subset s minimizing #s )
  end program ;

program

  program reach ;
    read ( e , w ) ;
    a := { } ;
    b := { } ;
    ( forall x27 in a )
      ( forall x26 in e { x27 } | x26 notin b )
        b with := x26 ;
      end forall ;
    end forall ;
    d := { } ;
    c := { } ;
    ( forall x29 in w )
      if x29 notin a then
        d with := x29 ;
      end if ;
      c with := x29 ;
    end forall ;
    ( forall x29 in b | x29 notin c )
      if x29 notin a then
        d with := x29 ;
      end if ;
      c with := x29 ;
    end forall ;
    ( while exists x24 in d )
      ( forall x28 in e { x24 } | x28 notin b )
        if x28 notin w then
          if x28 notin a then
            d with := x28 ;
          end if ;
          c with := x28 ;
        end if ;
        b with := x28 ;
      end forall ;
      if x24 in c then
        d less := x24 ;
      end if ;
      a with := x24 ;
    end while ;
    print ( a ) ;
  end program ;
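
For readers without a SETL system, the two reachability
programs can be compared with a rough Python transliteration.
This is a sketch under assumptions: e is taken to be a dict
from nodes to successor sets, the variable roles (a is the
result, b is the image e[a], c is the union of w and b, d is
the workset c - a) are inferred from the generated code, and
the function names are hypothetical.

```python
def tclose_spec(e, w):
    """Direct reading of the specification: least s with w and e[s] inside s."""
    s = set(w)
    changed = True
    while changed:
        image = set().union(*(e.get(v, set()) for v in s))
        changed = not image <= s
        s |= image
    return s


def reach(e, w):
    """Workset version following the structure of the generated program."""
    a = set()        # nodes already placed in the result
    b = set()        # maintained as e[a]
    c = set(w)       # maintained as w | b
    d = set(w)       # workset, maintained as c - a
    while d:
        x = next(iter(d))
        for y in e.get(x, ()):
            if y not in b:
                if y not in w:
                    if y not in a:
                        d.add(y)
                    c.add(y)
                b.add(y)
        d.discard(x)
        a.add(x)
    return a


graph = {1: {2}, 2: {3, 1}, 4: {5}}
reached = reach(graph, {1})
```

Because a starts empty, the initialization loops of the
generated program reduce here to copying w into c and d. Each
node enters the workset at most once and each edge is examined
once, which is where the linear running time comes from.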

2. Live Code Analysis

program

  program useless ;
    assume oneone ( instof ) ;
    assume onemany ( iuses ) ;
    assume manyone ( compound ) ;
    assume disjoint ( range instof , range compound ) ;
    read ( instof , usetodef , iuses , compound , crit ) ;
    print ( the live: crit subset live | ( instof [ usetodef [ iuses [
      live ] ] ] + compound [ live ] ) subset live minimizing #live )
  end ;

program

  program useless ;
    assume oneone ( instof ) ;
    assume onemany ( iuses ) ;
    assume manyone ( compound ) ;
    assume disjoint ( range instof , range compound ) ;
    read ( instof , usetodef , iuses , compound , crit ) ;
    a := crit ;
    d := { } ;
    c := { } ;
    e := { } ;
    ( forall x3 in a )
      ( forall x2 in iuses { x3 } )
        ( forall x7 in usetodef { x2 } | x7 notin c )
          d with := instof ( x7 ) ;
          c with := x7 ;
        end forall ;
      end forall ;
      if compound ( x3 ) notin e then
        e with := compound ( x3 ) ;
      end if ;
    end forall ;
    g := { } ;
    f := { } ;
    ( forall x10 in d )
      if x10 notin a then
        g with := x10 ;
      end if ;
      f with := x10 ;
    end forall ;
    ( forall x10 in e )
      if x10 notin a then
        g with := x10 ;
      end if ;
      f with := x10 ;
    end forall ;
    ( while exists x1 in g )
      ( forall x4 in iuses { x1 } )
        ( forall x7 in usetodef { x4 } | x7 notin c )
          if instof ( x7 ) notin a then
            g with := instof ( x7 ) ;
          end if ;
          f with := instof ( x7 ) ;
          c with := x7 ;
        end forall ;
      end forall ;
      if compound ( x1 ) notin e then
        if compound ( x1 ) notin a then
          g with := compound ( x1 ) ;
        end if ;
        f with := compound ( x1 ) ;
        e with := compound ( x1 ) ;
      end if ;
      if x1 in f then
        g less := x1 ;
      end if ;
      a with := x1 ;
    end while ;
    print ( a ) ;
  end ;


3. Attribute Closure

program

  program aclose ;
    read ( x , f ) ;
    print ( the s: x subset s | forall y in domain f |
      y subset s impl f(y) subset s minimizing #s )
  end ;

program

  program aclose ;
    read ( x , f ) ;
    a := x ;
    h := { } ;
    ( forall x21 in domain f )
      ( forall x20 in x21 )
        h { x20 } with := x21 ;
      end forall ;
    end forall ;
    c := { } ;
    ( forall x11 in domain f )
      ( forall x10 in x11 )
        if x10 notin a then
          c ( x11 ) := ( c ( x11 ) ? 0 ) + 1 ;
        end if ;
      end forall ;
    end forall ;
    g := { } ;
    e := { } ;
    ( forall x16 in domain f )
      if c ( x16 ) = 0 then
        ( forall x19 in f ( x16 ) | x19 notin e )
          if x19 notin a then
            g with := x19 ;
          end if ;
          e with := x19 ;
        end forall ;
      end if ;
    end forall ;
    ( while exists x9 in g )
      ( forall x13 in h { x9 } )
        if x13 in domain f then
          if ( not c ( x13 ) = 0 ) and c ( x13 ) = 1 then
            ( forall x19 in f ( x13 ) | x19 notin e )
              if x19 notin a then
                g with := x19 ;
              end if ;
              e with := x19 ;
            end forall ;
          end if ;
        end if ;
        c ( x13 ) - := 1 ;
      end forall ;
      if x9 in e then
        g less := x9 ;
      end if ;
      a with := x9 ;
    end while ;
    print ( a ) ;
  end ;
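
The counting idea behind the attribute closure program can be
sketched in Python as follows. This is not a line-by-line
translation of the generated code. It assumes f is a dict from
frozenset left-hand sides to sets of attributes, and the
function name attribute_closure is hypothetical. For each
left-hand side it keeps a count of attributes not yet in the
closure, and fires the dependency when the count reaches zero.

```python
def attribute_closure(x, f):
    """Closure of attribute set x under dependencies f (lhs -> rhs).

    f maps frozenset left-hand sides to iterables of attributes.
    """
    a = set(x)                       # closure built so far
    h = {}                           # attribute -> left-hand sides containing it
    for lhs in f:
        for attr in lhs:
            h.setdefault(attr, []).append(lhs)
    # c[lhs] counts the attributes of lhs that are still missing from a.
    c = {lhs: sum(1 for attr in lhs if attr not in a) for lhs in f}
    seen = set(a)                    # attributes in a or already queued
    g = []                           # workset of attributes waiting to be added
    for lhs in f:
        if c[lhs] == 0:
            for attr in f[lhs]:
                if attr not in seen:
                    seen.add(attr)
                    g.append(attr)
    while g:
        attr = g.pop()
        a.add(attr)
        for lhs in h.get(attr, ()):
            c[lhs] -= 1              # one fewer missing attribute
            if c[lhs] == 0:
                for new in f[lhs]:
                    if new not in seen:
                        seen.add(new)
                        g.append(new)
    return a


deps = {
    frozenset({"A"}): {"B"},
    frozenset({"B", "C"}): {"D"},
    frozenset({"D"}): {"E"},
}
closed = attribute_closure({"A", "C"}, deps)
```

Every attribute is queued at most once and every occurrence of
an attribute in a left-hand side is decremented at most once,
so the work is linear in the total size of the dependencies.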


4. Constant Propagation

program

  program const ;
    read ( assign , usetodef , compute ) ;
    print ( the const: empty subset const |
      ( const = { s in assign | ( forall t in ( domain usetodef ) { s }
        | ( ( forall x in usetodef { [ s , t ] } | x in const ) and
            # { compute ( x ) : x in usetodef { [ s , t ] } * const } <= 1 ) ) } )
      minimizing # const ) ;
  end ;

program

  program const ;
    read ( assign , usetodef , compute ) ;
    out := empty ;
    m := { } ;
    ( forall x45 in domain usetodef , x46 in usetodef { x45 } )
      m { x46 } with := x45 ;
    end forall ;
    b := { } ;
    ( forall [ x19 , x20 ] in usetodef )
      if x20 notin out then
        b ( x19 ) + := 1 ;
      end if ;
    end forall ;
    h := { } ;
    g := { } ;
    ( forall [ x30 , x31 ] in usetodef )
      if x31 in out then
        if compute ( x31 ) notin g { x30 } then
          h ( x30 ) + := 1 ;
          g { x30 } with := compute ( x31 ) ;
        end if ;
      end if ;
    end forall ;
    j := { } ;
    ( forall [ x41 , x40 ] in ( domain usetodef ) )
      if h ( x41 , x40 ) > 1 then
        j ( x41 ) + := 1 ;
      end if ;
    end forall ;
    d := { } ;
    ( forall [ x26 , x25 ] in ( domain usetodef ) )
      if b ( x26 , x25 ) > 0 then
        d ( x26 ) + := 1 ;
      end if ;
    end forall ;
    l := { } ;
    k := { } ;
    e := { } ;
    ( forall x29 in assign )
      if d ( x29 ) = 0 then
        if j ( x29 ) = 0 then
          if x29 notin out then
            l with := x29 ;
          end if ;
          k with := x29 ;
        end if ;
        e with := x29 ;
      end if ;
    end forall ;
    ( while exists x2 in l )
      ( forall x21 in m { x2 } )
        if x21 in ( domain usetodef ) and b ( x21 ) = 1 then
          if x21 ( 1 ) in assign then
            if d ( x21 ( 1 ) ) = 0 then
              if j ( x21 ( 1 ) ) = 0 then
                if x21 ( 1 ) notin out then
                  l less := x21 ( 1 ) ;
                end if ;
                k less := x21 ( 1 ) ;
              end if ;
              e less := x21 ( 1 ) ;
            elseif d ( x21 ( 1 ) ) = 0 + 1 then
              if j ( x21 ( 1 ) ) = 0 then
                if x21 ( 1 ) notin out then
                  l with := x21 ( 1 ) ;
                end if ;
                k with := x21 ( 1 ) ;
              end if ;
              e with := x21 ( 1 ) ;
            end if ;
          end if ;
          d ( x21 ( 1 ) ) - := 1 ;
        end if ;
        b ( x21 ) - := 1 ;
      end forall ;
      ( forall x35 in m { x2 } )
        if compute ( x2 ) notin g { x35 } then
          if x35 in ( domain usetodef ) then
            if h ( x35 ) = 1 - 1 then
              if x35 ( 1 ) in e then
                if j ( x35 ( 1 ) ) = 0 then
                  if x35 ( 1 ) notin out then
                    l less := x35 ( 1 ) ;
                  end if ;
                  k less := x35 ( 1 ) ;
                elseif j ( x35 ( 1 ) ) = 0 - 1 then
                  if x35 ( 1 ) notin out then
                    l with := x35 ( 1 ) ;
                  end if ;
                  k with := x35 ( 1 ) ;
                end if ;
              end if ;
              j ( x35 ( 1 ) ) + := 1 ;
            elseif h ( x35 ) = 1 then
              if x35 ( 1 ) in e then
                if j ( x35 ( 1 ) ) = 0 then
                  if x35 ( 1 ) notin out then
                    l less := x35 ( 1 ) ;
                  end if ;
                  k less := x35 ( 1 ) ;
                elseif j ( x35 ( 1 ) ) = 0 + 1 then
                  if x35 ( 1 ) notin out then
                    l with := x35 ( 1 ) ;
                  end if ;
                  k with := x35 ( 1 ) ;
                end if ;
              end if ;
              j ( x35 ( 1 ) ) - := 1 ;
            end if ;
          end if ;
          h ( x35 ) + := 1 ;
          g { x35 } with := compute ( x2 ) ;
        end if ;
      end forall ;
      if x2 in k then
        l less := x2 ;
      end if ;
      out with := x2 ;
    end while ;
    print ( out ) ;
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Abstract 

The project's research emphasis is on the computational and mathematical
infrastructure needed to support the software development of a geometric modeling system.
This is a software system which allows for the efficient creation and manipulation
of concise boundary representations of curved solid models of physical objects. The
geometric coverage includes algebraic curves and surfaces of arbitrary degree, and
allows both the implicit and rational parametric representations with both power and
Bernstein polynomial bases. Utilities are provided for automatic conversions between
the various polynomial representations. Furthermore, a graphical user interface allows
creation of Bezier control polygons and polyhedra, for efficient design of Bezier curve
segments and Bezier surface patches. Geometric operations include boolean set
operations, offsets, sweeps, solid decompositions, and wireframe flesh via interpolation.
Graphical display facilities include quick wireframe plot and redisplay of the
boundary representation, three dimensional transformations of the solid, and high resolution
color rendering. The geometric modeling system is being implemented in Common
Lisp, Fortran and C on a combination of Symbolics 3650, HP-370SRX Turbo, and
SUN 4-110 platforms.


Supported in part by ONR contract N00014-68-K-0402.
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1  Background 

In the summer of 1986 plans were made for developing a geometric modeling system for the
efficient creation and manipulation of accurate computer models of solid physical objects.
A primary goal since then has been to accurately model the boundary of rigid physical objects
with algebraic surface patches. The focus is on using the lowest degree surface patches
which satisfy the design constraints, since lower degree surfaces lend themselves to faster
computations in geometric design operations as well as in tasks such as computer graphics display,
animation, and various physical simulations. The project's research emphasis is on the
computational and mathematical infrastructure needed to support the software development of
this geometric modeling system.

Geometric Coverage: We focus on the use of low degree, implicitly defined, algebraic surfaces
in three dimensional space R^3. A real algebraic surface S in R^3 is implicitly defined by a
single polynomial equation f(x, y, z) = 0, where the coefficients of f are over the real numbers
R. A real algebraic space curve can be defined by the intersection of two real algebraic
surfaces and implicitly represented as a pair of polynomial equations (f_1(x, y, z) = 0 and
f_2(x, y, z) = 0) with coefficients again over the real numbers R. In modeling the boundary
of physical objects it suffices to consider only space curves defined by the intersection of two
algebraic surfaces. Space curves in general are defined by the intersection of several surfaces.
A rational algebraic space curve can also be represented by the triple (x = G_1(s), y =
G_2(s), z = G_3(s)), where G_1, G_2 and G_3 are rational functions in s. Whenever we consider
the special case of a rational space curve, we assume that the curve is smooth and only singly
defined under the parameterization map, i.e., each triple of values for (x, y, z) corresponds
to a single value of s.
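
As a small illustration of these definitions, the following Python sketch represents two implicit surfaces (a unit sphere and the plane z = 0, both chosen here purely as examples) and a rational parameterization of their intersection curve, the unit circle. The function names are hypothetical and are not taken from the project software. Exact rational arithmetic is used so that the check f_1 = f_2 = 0 holds exactly.

```python
from fractions import Fraction

def sphere(x, y, z):
    """Implicit unit sphere: f_1(x, y, z) = x^2 + y^2 + z^2 - 1."""
    return x * x + y * y + z * z - 1

def plane(x, y, z):
    """Implicit plane: f_2(x, y, z) = z."""
    return z

def circle_point(s):
    """Rational parameterization (G_1(s), G_2(s), G_3(s)) of the unit
    circle, which is the intersection of the sphere and the plane."""
    s = Fraction(s)
    d = 1 + s * s
    return ((1 - s * s) / d, 2 * s / d, Fraction(0))

# Every parameter value yields a point lying on both implicit surfaces.
for s in (0, 1, Fraction(1, 3), -2):
    p = circle_point(s)
    assert sphere(*p) == 0 and plane(*p) == 0
```

Note that this particular parameterization misses the single point (-1, 0, 0), which is only approached as s grows without bound.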

Why algebraic surfaces? Manipulating polynomials, as opposed to arbitrary analytic functions,
is computationally more efficient. Furthermore, algebraic surfaces provide enough generality
to accurately model almost all complicated rigid objects. Also, algebraic curves and
surfaces lend themselves very naturally to the difficult computational problem of physical
object design.

Why implicit representations? Most prior approaches to geometric and solid modeling have
focused on the parametric representation of surfaces. Contrary to prevailing opinion, and as we
exhibit through our research, implicitly defined surfaces are also very appropriate. Additionally,
while all algebraic surfaces can be represented implicitly, only a subset of them have the
alternate parametric representation, with x, y and z given explicitly as rational functions of
two parameters. Furthermore, implicit algebraic curves and surfaces have compact storage
representations and form a class which is closed under most common operations required by
a geometric modeling system.

2 Research Issues, Objectives & Directions


• Develop computational techniques using algebraic geometry and numerical approximation
theory to eliminate bottlenecks in geometric modeling operations. These include
robust Boolean set operations (union, intersection, etc.), solid decompositions, offsets,
envelopes and sweep computations. Accuracy and robustness are two of the most
pressing technical issues in geometric modeling.

• Develop and implement efficient "modular" algorithms for algebraic curve and surface
parameterization, implicitization and singularity resolution. Utilize efficient methods
of Chinese remaindering, Hensel lifting and multivariate interpolation.

• Develop graphical user interfaces for easy editing of geometric object information.
This includes three dimensional transformations such as translation, rotation, scaling, etc.

3 Approaches & Progress

•  Algebraic  Boundary  Model  Creation  -  an  editing  toolkit. 

This package is in CLisp and Fortran on the Symbolics and is one of continual growth
[4, 11, 12, 16, 19, 21, 22, 23]. Capabilities: (a) Allows quick wireframe plot and
redisplay of the curved solid boundary data structure for arbitrary algebraic surfaces.
Has a robust surface-surface intersection routine with calls to a Fortran SVD subroutine.
Also has a revamped makesolid routine to produce an internal form of the boundary
description for solid manipulation routines (Boolean operations, triangulation, ...). (b)
Produces a complete boundary description of "offsets" of points, line segments, angles
(two line segments), .... Takes as input one, two, three, ... points and an offset radius;
from there the plot capabilities of (a) take over. 3D solid transformation routines
(scaling, translation, rotation) are also implemented and used to derive the boundary
description. (c) Handles "extrudes" and "curved solids of revolution" on the same lines
as (b). Study of robustness issues of curved model reconstruction and display, via
symbolic reasoning, is underway. Benefits all projects below which manipulate solids
with boundary descriptions. (Summer project for undergrad: Implement a color render
program of the solid boundary description for the HP graphics workstation.)

Current  Programmer.  Steven  Klinkner 

•  Robust  Polyhedra  Triangulation  -  robust  modeling  operation  using  the  topological 
reasoning  paradigm. 

This package is in CLisp on the Symbolics and is near completion [10]. Takes as input
an arbitrary simple polyhedron, in a modified version of Karasick's external boundary
description, and produces a convex decomposition/triangulation of the polyhedron. All the
pieces are returned in the boundary description. An algorithm was developed where
the theoretical bound of Chazelle's 1984 algorithm was improved by a factor of
O(N / log N), where N = number of reflex edges. Chazelle recently informed us that
he had shaved off the additional O(log N) factor by a different method. Implementation
was done to better understand the issues of robustness. Rewrote and corrected
some of Karasick's robust classification routines. Developed a robust plane-sweep
algorithm for detecting edge-cycle loops. Has an input and display interface from the
editing toolkit above, as well as from S-Geometry. Color shaded pictures can also be
produced. Should prove useful in calculating volumetric properties of solid models
and in interfacing simple finite element programs. Next step: Robust Triangulation of
Curved Solids.

Current  Programmer.  Tamal  Dey 

•  Hermite  Interpolation  with  Algebraic  Surfaces  -  automatic  surface  generation. 

This package is in CLisp and C on an HP color graphics workstation. Version One
was completed at the end of February [15]. Currently working on Version Two with a better
user interface and new features. Takes as input points and curves in space together
with "normal" directions and produces a family of lowest degree algebraic surfaces
which "smoothly" interpolate the points and curves. Nonsingularity and convexity
constraints, as well as numerical conditioning issues, are addressed by doing computations
with polynomials in Bernstein basis (as opposed to the traditional power basis).
Currently uses the Macsyma routine for linear system solutions over integral domains. Has
a textual input interface and uses the graphics workstation hardware for color display.
Should prove useful in fleshing curved wireframes, for smooth meshing with low degree
surfaces, generating blending and joining surfaces and for constructing low degree
curved finite elements.

Current Programmer. Insung Ihm
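
To make the power basis versus Bernstein basis distinction concrete, here is a minimal univariate sketch in Python. It is an illustration only: the package described above works with surfaces, and the function names below are hypothetical. The conversion uses the standard identity b_j = sum over i <= j of C(j, i) / C(n, i) * a_i for a degree n polynomial with power coefficients a_i, and evaluation uses de Casteljau's repeated linear interpolation, which relies only on convex combinations for t in [0, 1].

```python
from fractions import Fraction
from math import comb

def power_to_bernstein(a):
    """Convert power-basis coefficients a[0] + a[1] t + ... + a[n] t^n
    into Bernstein coefficients of degree n on the interval [0, 1]."""
    n = len(a) - 1
    return [sum(Fraction(comb(j, i), comb(n, i)) * a[i] for i in range(j + 1))
            for j in range(n + 1)]

def de_casteljau(b, t):
    """Evaluate a polynomial given by Bernstein coefficients b at t
    by repeated linear interpolation between neighboring coefficients."""
    pts = list(b)
    while len(pts) > 1:
        pts = [(1 - t) * p + t * q for p, q in zip(pts, pts[1:])]
    return pts[0]

# p(t) = 1 + 2t + 3t^2 in both bases
bern = power_to_bernstein([1, 2, 3])
value = de_casteljau(bern, Fraction(1, 2))
```

The first and last Bernstein coefficients equal p(0) and p(1), one reason the basis is convenient for interpolation constraints.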

•  Package  for  Solving  Systems  of  Polynomial  Equations  -  all  roots  solver. 

This package is in CLisp and C on a SUN 4 and is a long term effort [1, 2, 3, 6, 8,
9, 13, 14]. The goal is to have both a robust and efficient set of routines to construct
compact representations for all the roots of a general system of multivariate polynomial
equations. For 0 dimensional solutions, approximate real solutions are obtained
within "epsilon" neighborhoods of the true solutions. For k dimensional solutions in
n space, a hypersurface in k + 1 dimensional space is generated, together with points
on the true solutions expressed as rational functions of points on the hypersurface.
Methods are based on Sylvester and Macaulay resultants and subresultants as well as
symbolic parameterization routines. A very fast Sylvester routine, based on Chinese
remaindering, is implemented and a similar implementation for Macaulay's resultant is
currently underway. Global parameterization routines for hypersurfaces of degree up to
three have also been implemented. Currently uses the Macsyma routine for univariate
polynomial real root solving, and curve tracing routines to display the zero, one and
two dimensional solutions. The package has an interactive user interface to enter equations
and select different ways of solving and displaying solutions. Provides a basic
mathematical package of polynomial manipulation routines for varied applications.

Current  Programmer.  Andrew  Royappa 
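
The idea of a Sylvester resultant computed by Chinese remaindering can be sketched as follows. This is a minimal Python illustration for two univariate integer polynomials, not the package's routine: the function names and the choice of primes are assumptions made here, and the sketch assumes the product of the primes exceeds twice the absolute value of the resultant (a production routine would derive that from a determinant bound). The determinant of the Sylvester matrix is computed modulo each prime by Gaussian elimination and the residues are then combined.

```python
def sylvester_matrix(f, g):
    """Sylvester matrix of f and g (coefficient lists, highest degree first)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[0] * i + f + [0] * (size - m - 1 - i) for i in range(n)]
    rows += [[0] * i + g + [0] * (size - n - 1 - i) for i in range(m)]
    return rows

def det_mod(matrix, p):
    """Determinant modulo a prime p, by Gaussian elimination over GF(p)."""
    a = [[x % p for x in row] for row in matrix]
    n = len(a)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col]), None)
        if pivot is None:
            return 0                      # singular modulo p
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                    # a row swap flips the sign
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)     # modular inverse (Python 3.8+)
        for r in range(col + 1, n):
            factor = a[r][col] * inv % p
            if factor:
                a[r] = [(x - factor * y) % p for x, y in zip(a[r], a[col])]
    return det % p

def resultant_crt(f, g, primes):
    """Resultant (Sylvester determinant) recovered from its residues."""
    matrix = sylvester_matrix(f, g)
    value, modulus = 0, 1
    for p in primes:
        r = det_mod(matrix, p)
        # Incremental Chinese remaindering: lift from `modulus` to `modulus * p`.
        t = (r - value) * pow(modulus, -1, p) % p
        value += modulus * t
        modulus *= p
    # Use the symmetric range so that negative resultants are recovered.
    return value - modulus if value > modulus // 2 else value

# (x - 1)(x - 2) and (x - 3) have no common root, so the resultant is nonzero.
print(resultant_crt([1, -3, 2], [1, -3], [101, 103, 107]))
```

A zero resultant signals a common root, which is what makes resultants useful for eliminating variables from polynomial systems.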

• Power series factorizations and Padé approximations - analyzing curve and
surface singularities.

This package is in CLisp on the Symbolics and is complete for curves [5, 7]. The
algorithms for surfaces are simultaneously being developed and implemented. Takes
implicit algebraic curves and surfaces as input and produces a power series
parameterization for all the branches at a singularity (curve branches about a singular point,
and surface branches about a singular curve). Further, Padé rational approximants
can be computed for the power series parameterizations. Has an interactive, textual
user interface, where different degrees of approximation can be specified. The output is
both textual and graphical, displaying the original and approximated branches of the
curve and surface. The algorithms are based on Hensel lifting of power series, yielding
Newton and Weierstrass factorizations. The Padé routines are based on the Brent,
Gustavson and Yun method of using the extended GCD algorithm to solve Toeplitz
matrix computations. Provides the essential routines for constructing a piecewise
rational approximation of any curve or surface. Projected usefulness in modeling and
graphics.

Current  Programmer.  Chanderjit  Bajaj 
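
The connection between Padé approximants and the extended GCD algorithm can be illustrated with a short Python sketch. It uses the plain quadratic-time extended Euclidean algorithm on x^(m+n+1) and the truncated series, not the fast variant of Brent, Gustavson and Yun, and all function names are hypothetical. Polynomials are lists of Fractions with the constant term first. The sketch assumes the resulting denominator has a nonzero constant term.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (the zero polynomial is [])."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_sub(a, b):
    n = max(len(a), len(b))
    a = list(a) + [Fraction(0)] * (n - len(a))
    b = list(b) + [Fraction(0)] * (n - len(b))
    return trim([x - y for x, y in zip(a, b)])

def poly_divmod(a, b):
    """Quotient and remainder of a divided by a nonzero polynomial b."""
    a = trim(a)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while a and len(a) >= len(b):
        shift = len(a) - len(b)
        c = a[-1] / b[-1]
        q[shift] = c
        for i, bc in enumerate(b):
            a[i + shift] -= c * bc
        a = trim(a)
    return trim(q), a

def pade(series, m, n):
    """[m/n] Pade approximant P/Q with series * Q = P mod x^(m+n+1)."""
    size = m + n + 1
    r0 = [Fraction(0)] * size + [Fraction(1)]        # x^(m+n+1)
    r1 = trim([Fraction(c) for c in series[:size]])  # truncated series
    t0, t1 = [], [Fraction(1)]                       # r_i = t_i * series mod x^size
    while len(r1) - 1 > m:                           # stop once deg(r_i) <= m
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        t0, t1 = t1, poly_sub(t0, poly_mul(q, t1))
    scale = t1[0]                                    # normalize so that Q(0) = 1
    return [c / scale for c in r1], [c / scale for c in t1]

# exp(x) = 1 + x + x^2/2 + ... has [1/1] approximant (1 + x/2) / (1 - x/2)
num, den = pade([1, 1, Fraction(1, 2)], 1, 1)
```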

•  Compliant  Path  Planner  -  generating  contact  paths  for  a  curved  object  with  fixed 
orientation. 

This package is in CLisp on the Symbolics and is complete for a planar curved model,
moving with fixed orientation, and in continuous contact with other static planar
curved models [17]. The implementation for solid models is pending [18, 20]. The
method is based on the convolution computation, made efficient with simple "paint"
heuristics. Has a menu driven, graphical interface to specify and display planar model
descriptions. The planar models currently are made of piecewise circular arcs and
straight lines. The compliant path is demonstrated by a graphical animation of a
planar model moving in continuous contact with the fixed curved models. Next Step:
Besides upgrading this to three dimensions, we hope to interface this planner with
Newton.

•  Multiple  Object  Motion  Coordination  -  path  generation  through  simulation. 

This package is in CLisp on the Symbolics and is first being programmed for coordinating
the simultaneous collision free motion of homogeneous simple discs in the plane. The
approach is Voronoi based, where at each time step a disc considers only its Voronoi
neighbors as potential collision threats. A dynamic planar Voronoi diagram is being
implemented. A static planar Voronoi diagram is already complete. The velocity and
acceleration of the discs are handled by the dynamic equations of Newton. Next Step:
To move on to three dimensions.

Current Programmer. Bill Bouma

4  Miscellaneous  Topics 

Grand  Challenge: 

The  accurate  re-design  of  the  exterior  geometry  of  the  space  shuttle  Discovery  or  carrier 
Enterprise  in  three*  hours,  or  less. 

Research Transitions:

Geometric modeling tools necessarily find a wide range of design applications in large
volume, manufacturing industries such as Boeing, General Motors, Ford, General Dynamics,
Department of Defense, etc. These tools are being used for the design of the exterior geometry
of airplanes, automobiles, rockets, ships, space shuttles, ... as well as for the numerous
parts (motors, engines, drive-shafts, etc.) that they consist of. More recently however, they
are increasingly being used by personnel involved in the physical and bio-medical sciences.
Examples abound in artificial limb design, crystallography, genetic research, pharmaceutical
research, and more.

Traditionally,  there  has  been  a  large  lag  time  between  university  research  and  its  success¬ 
ful  incorporation  for  specific  enhancements  in  industrial  products.  However,  with  the  close 
proximity  and  immediate  relevance  of  geometric  modeling  research  to  industrial  applications, 
this  gap  should  definitely  be  bridged.  Immediate  possibilities  are  through  joint  industry- 
university  conferences,  and  contractual  university  research  funded  by  industry,  while  long 
term  goals  may  be  met  by  industry  sponsored,  university  courses  and  laboratories. 

Technological  Impacts: 

Our current research and experimentation hardware consists of three Symbolics 3650
color workstations (about 3 MIPS, but with an excellent software development environment);
an HP-370SRX Turbo color graphics workstation (about 4 MIPS, with graphics accelerators
providing hidden surface removal and display transformations such as rotate, translate and
zoom in hardware or firmware); a color Sun 4-110 (about 7 MIPS, for quick computations, and
useful in experimenting with floating point and polynomial arithmetic); and access to an Alliant
FX-80 (a four processor number cruncher).

Integration of the software (and hardware) environment of these machines posed a big
challenge and many man months were spent achieving a certain level of compatibility. These
problems seem to have been somewhat resolved by the recent announcements of graphics
superworkstations (e.g. Ardent Titan or the Silicon Graphics IRIS 4D/240 GTX), which
combine MIPS power with graphics pipelines, and seem to be targeted at geometric modeling
and simulation projects such as ours. This new breed of machines shall definitely lead to
enhanced capabilities for research and experimentation in geometric modeling with high
degree algebraic surfaces.

*Estimated as the maximum single, continuous sitting time of a sophisticated designer.

MODELING PHYSICAL OBJECTS


Christoph M. Hoffmann
Computer  Science  Department 
Purdue  University 

Abstract

The research develops the infrastructure necessary for comprehensive, user-friendly software
systems that are capable of modeling and analyzing physical objects and systems of physical objects.
The focus of the work is on the following major areas:

• The logical foundations required to implement, without failure, the supporting geometric
operations in the face of limited precision arithmetic and uncertainties of position.

• The mathematical foundations of object representations, with specific emphasis on efficiency,
robustness, and accuracy.

• The development of conceptual primitives to support the design process and to interface
diverse mathematical models analyzing physical properties in a variety of contexts.

The project builds software tools and experimental systems that assess the viability of our ideas.
Experience shows that our ideas are productive, and the feedback from the community indicates
that this work is timely and of value.

1 Background


This work began during a two-year visit at Cornell University in 1984-86, as a collaborative effort with
John Hopcroft. I had worked with John before, so we had a history together, and an appreciation
of each other's abilities. My prior work had been in graph isomorphism and computational group
theory. The narrowness of the computer-science community interested in this subject convinced
me that I should look for a broader, and more applied, area of work. So, I came to Cornell at a
perfect point in time.

1.1  Context  of  the  Project 

Science and manufacturing technology are currently undergoing a major change whose thrust
is to computerize all constituent processes and automate design, analysis, and evaluation.
Signposts of this restructuring are the name change of the National Bureau of Standards, the inception
of major research initiatives, such as DARPA's DICE project, and the increasing interest in the
computational paradigm by the natural sciences.

This ongoing restructuring is made possible by the wider availability of larger and faster
computers that are, in effect, opening up new dimensions in problem scale and detail that can
be effectively contemplated. It necessitates an interdisciplinary effort to devise proper models of
physical objects that are amenable to interrogation and modification through computing. Our effort
is at the center of computer aided design, computer aided manufacture, robotics, and computational
science.

The  main  thrusts  of  our  work  are  research  in  geometric  and  solid  modeling,  and  research  in 
automating  the  application  of  these  models  in  a  variety  of  endeavors  including  the  simulation  of 
physical  phenomena  and  their  computational  analysis. 

•  In  solid  modeling,  we  investigate  all  levels  of  the  design  and  implementation  process,  includ¬ 
ing  problems  arising  at  the  conceptual  design  level,  problems  to  be  solved  when  extending 
the  geometric  capabilities,  and  problems  dealing  with  the  foundational  issues  raised  by  im¬ 
plementations. 

• In physical simulation, we are refining the Newton system developed jointly with Cornell
University, and concentrate on expanding the physical coverage by studying interface problems
that integrate this system with other, existing simulation and analysis systems of
complementary capability.

Since  its  inception  five  years  ago,  this  work  has  produced  a  rich  variety  of  results  and  prototypes. 
Past  and  present  efforts  are  coextensive  with  related  efforts,  most  notably  the  work  done  by  John 
Hopcroft at Cornell. This is most appropriate given the nature and magnitude of the work to be
done.

2 Research Objectives


The goal of this work is to create a science base for computer representation, analysis and
manipulation of models of physical objects, and to develop the infrastructure necessary to give wide
applicability to the insights and techniques developed in the course of the project. At this time,
two specific foci are in the foreground:

1. Develop and broaden geometric and solid-modeling capabilities.

2. Develop and broaden tools for modeling and simulating systems of physical objects.

Each of these foci splits into several subefforts that conceptualize current problems and how
they can be overcome. In geometric and solid modeling, the following objectives are pursued:

1. Investigate what makes the substrata unreliable on which geometric modeling is implemented,
and develop ways to make it reliable.

2. Develop the algorithmic and mathematical infrastructure needed to enlarge the geometric
coverage of modelers and make modelers more efficient.

3. Develop good user interface languages ultimately resulting in increased productivity in
engineering design.

In  simulation  and  modeling  of  physical  systems  the  following  objectives  are  pursued: 

1.  Investigate  the  interaction  between  geometric  shape  and  physical  behavior. 

2. Investigate modalities and design methods to accommodate changing modalities.

3.  Integrate  the  computational  treatment  of  different  physical  aspects,  such  as  motion,  heat 
transfer,  stress  and  vibration. 

The  two  areas  of  research  are  portrayed  in  separate  sections. 


3  Research  Issues  in  Geometric  and  Solid  Modeling 

The  geometric  and  solid  modeling  research  addresses  the  computer  science  aspects  of  how  to  rep¬ 
resent,  manipulate,  and  analyze  the  shape  of  physical  objects  by  computer.  The  work  focuses  on 
three  subareas:  substrata  issues,  infrastructure  issues,  and  user-interface  issues. 

3.1  Problems  in  Geometric  and  Solid  Modeling 
Substrata 

It is a widely recognized fact that most geometric and solid modeling systems fail under
certain conditions. Typically, for objects with surface elements that are almost, but not quite,
coincident, valid inputs to a modeling system may fail to produce valid results and could even
crash the system. Less widely recognized is that these problems are ultimately rooted in the fact that
the modeler's infrastructure of algorithms has been designed with infinite precision arithmetic in
mind, but is typically implemented, for efficiency reasons, using fixed precision arithmetic, i.e.,
floating point numbers; see, e.g., Hoffmann, Hopcroft, and Karasick (1987). In consequence, certain
computations from which to deduce symbolic geometric facts, e.g., incidence or nonincidence, are
inconclusive due to insufficient precision. These inconclusive results must be interpreted, and from
them deductions must be made based on incomplete information.


Infrastructure

Techniques are needed to represent, modify, and interrogate objects whose shape elements are
algebraics of unrestricted degree. This is a bold undertaking that generalizes at once all previous
approaches to solid modeling. For example, free form surface design concentrates on working with
specific classes of (parameterizable) algebraic surfaces.

Previous work in this direction has been stymied by severe problems arising when dealing with
high-degree algebraics. For this reason, most modeling systems drastically restrict the types and
degrees of the allowed surface elements. Our infrastructure research makes use of all available
successful techniques, including numerical methods, symbolic computation, and differential geometric
techniques, in order to obtain the best results possible. This pragmatism is absolutely essential if
the goals of accuracy, efficiency, and robustness are to be attained.

User-Interfaces 

Effective use of solid modelers currently requires specialists with much training. There is a need
to make these systems accessible to the nonspecialist. Computer science should be able to make
sophisticated contributions in this area, given the deep insights the field has gained in programming
language design.

3.2 Research Approach to Geometric and Solid Modeling

Assume that we have to make a decision based on unreliable numeric data. If we are dealing
with complex geometric objects, such as the representation of solids, the decision to be made will
depend on how we decided other uncertain computations during earlier parts of the computation.
This interdependence of decisions is not easily recognized algorithmically. Failing to recognize
it, however, makes it likely that we make inconsistent decisions which could crash the algorithm.
This research topic is widely acknowledged to be of critical importance. Our work
approaches the substrata problems as follows:

1. Develop a clear understanding of the extent of the problem in specific applications such as
Boolean operations on solids. We are in the process of completing the tools needed for this
work, a dual-mode modeler capable of doing the same operation both in floating point and
in exact rational arithmetic, using identical algorithms.

2. Investigate the logical complexity of reasoning about the consequences of the numerical
decisions. Past work has developed the reasoning paradigm, current work will extend it.

We use algebraic methods, numerical methods, and methods from differential geometry. The
algebraic methods investigate how to apply techniques from algebraic geometry, such as
desingularization, and how to make computations such as Gröbner bases more effective. An important
criterion here is that the methods should not involve excessive or intractable computations.

The numerical approach seeks ways to increase accuracy and geometric coverage by reformulating
problems in higher dimensions. A dimensionality paradigm has been formulated, and its utility
is currently under investigation. The objective here is to reduce algebraic degrees by introducing
more variables.

The differential approach looks at a variety of projection methods, seeking to determine which
classical techniques have potential in geometric and solid modeling. We restrict this work for now
to these specific problems:

1.  Given  a  point  p  and  a  surface  in  three  space,  determine  the  distance  of  p  from  the  surface, 
and  determine  the  projection  of  the  point  to  a  surface  point  q  of  minimum  distance. 

2.  Given  a  space  curve  and  a  surface,  trace  the  space  curve  and  simultaneously  its  projection 
onto  the  surface.  Here,  each  point  p  on  the  curve  is  projected  to  a  point  q  of  minimum 
distance  on  the  surface. 

Effective design seems to require a notion of "feature". Proper definition of the concept, and
an elucidation of the spatial interaction of different features, is a thorny problem that provokes
intense discussions in workshops and conferences. No accepted definitions have emerged yet, and
this situation may persist for some time to come. We plan to approach the problem from a different
perspective: Given a particular object in full detail, "approximate" it by deleting detail features
that are unimportant to the overall design. Thus, we attempt to derive a hierarchy of shapes, each
progressively less detailed.

To approach this problem and give a satisfactory solution requires experimentation, and we
have the necessary tools in place to do this. We plan to examine several complex designs, analyze
their features, give a formal feature definition, and devise an approximation algorithm. The results
of this algorithm must then be inspected, and judged as to their satisfactoriness. Unsatisfactory
approximations can then be analyzed and traced to possible flaws in the feature definition or to
unexpected interactions between features.


3.3  Progress  in  Geometric  and  Solid  Modeling 

Past research has isolated the sources of the robustness difficulty in the polyhedral domain. As reported
in Hoffmann, Hopcroft and Karasick (1988), the difficulties are traceable to floating point
arithmetic impacting logical conclusions drawn, such as vertex/plane incidence. The paper also gives a
paradigm for approaching this problem and solving it using symbolic reasoning to ensure
consistency. A separate paper, Hoffmann (1989a), surveys the problem in the larger context of geometric
computations by computer, and contrasts our approach with others proposed by the field.

We are implementing an experimental modeler which will serve as a test bed for analyzing
which geometric errors can be tolerated and which ones are fatal. The modeler design pays special
attention to the arithmetic problem and has two modes of operation: one in which exact
arithmetic is used, and another one using floating-point arithmetic. Both versions work with identical
algorithms. The modeler will be used as follows:

1.  Run  the  floating  point  version  until  a  failure  has  been  encountered. 

2.  With  identical  input,  run  the  exact  arithmetic  version. 

3. If the exact arithmetic version also fails, then we have uncovered an algorithmic error.
Otherwise, the failure is a consequence of the limited precision arithmetic.

4. Note that by using the results of the exact arithmetic version we can classify the type of
failure.
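
The dual-mode idea can be sketched on a single geometric predicate. The Python fragment below is an illustration only, with hypothetical function names, and is far simpler than the modeler described here: one orientation test is written once and run on identical input in both floating-point and exact rational arithmetic. The input coordinates are integers that are exactly representable as doubles, so any disagreement between the two modes comes from rounding inside the computation and not from the input conversion.

```python
from fractions import Fraction

def orientation(p, q, r):
    """Sign of the signed area of triangle pqr:
    +1 for a left turn, -1 for a right turn, 0 for collinear points.
    The same code runs on floats and on Fractions."""
    det = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (det > 0) - (det < 0)

def run_both_modes(points):
    """Run the identical algorithm in floating-point and exact mode."""
    floats = [tuple(float(c) for c in pt) for pt in points]
    exact = [tuple(Fraction(c) for c in pt) for pt in points]
    return orientation(*floats), orientation(*exact)

# The exact determinant is (2^27 + 1)(2^27 - 1) - 2^27 * 2^27 = -1, but the
# first product, 2^54 - 1, is not representable as a double and rounds to 2^54.
points = [(0, 0), (134217729, 134217728), (134217728, 134217727)]
float_sign, exact_sign = run_both_modes(points)
print(float_sign, exact_sign)   # the two modes disagree on this input
```

When the two modes disagree, as they do here, the discrepancy is attributable to limited precision; when both give the same wrong answer, the algorithm itself is at fault.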

A two-dimensional version is already operational, and shows that some of the types of failure cited
in the literature are, in fact, not failures due to precision problems, but are programming errors.
The completed 3D version will give us a platform to systematically investigate the usefulness of
solutions proposed by us and others.

Symbolic  algebraic  methods  have  been  successfully  applied  to  a  variety  of  problems,  including 
tracing  plane  algebraic  curves  through  arbitrary  singularities,  Bajaj,  Hoffmann,  Hopcroft,  and 
Lynch  (1988).  Moreover,  for  specific  problems  such  as  the  elimination  of  variables  from  systems 
of  algebraic  equations,  we  have  developed  what  we  believe  is  the  fastest  known  method.  We  can 
successfully  tackle  problem  sizes  that  cannot  be  solved  by  any  other  approaches  that  have  been 
implemented. 

There are many difficult surface operations that one would like to implement but cannot do
so because the traditional approach entails intractable symbolic computations. Many of these
operations become almost trivial when reformulated in higher dimensional spaces. These operations
include surface offsets, needed in numerically controlled machining; Voronoi surfaces, needed to
precisely formulate certain blending surfaces; and blending surfaces that must satisfy special
constraints such as circularity of cross section. In each case, the derived surface is formulated as a set of
equations with more than three variables, and this multi-equational representation is used directly.
In Hoffmann (1988) we demonstrate that curves of algebraic degree well over 100 can be traced
with normal double-precision floating point arithmetic, to an accuracy of ten significant decimals.
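
As an illustration of such a multi-equational representation (a sketch written for this passage, with notation chosen here), consider the offset at distance d of a surface f(u, v, w) = 0. A point (x, y, z) lies on the offset when it is at distance d from a footpoint (u, v, w) on the base surface along the surface normal there:

```latex
\begin{aligned}
f(u, v, w) &= 0, \\
(x-u)^2 + (y-v)^2 + (z-w)^2 - d^2 &= 0, \\
(x-u,\; y-v,\; z-w) \times \nabla f(u, v, w) &= \mathbf{0}.
\end{aligned}
```

The cross product condition generically contributes two independent equations, so the system has four independent equations in the six variables (x, y, z, u, v, w) and describes a two-dimensional surface. Deriving a single implicit equation in x, y, z would require eliminating u, v and w, which is the kind of expensive symbolic computation that working directly with the system avoids.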

Work  is  underway  to  examine  the  various  surface  interrogations  of  importance,  and  to  assess 
how  they  might  be  restructured  to  work  with  the  higher  dimensional  version.  These  methods  in¬ 
clude  subdivision  in  higher  dimensions,  as  a  guaranteed  method  to  localize  the  various  branches; 
local  approximations  to  the  surface  without  any  variable  elimination;  and  distance  function  com¬ 
putations. 
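The subdivision idea can be sketched with naive interval arithmetic: a box is discarded as soon as an interval enclosure of the defining function excludes zero, and is split otherwise. The Python sketch below uses two variables and the unit circle for brevity; the `split` and `localize` helpers work for boxes of any dimension. All names and the example function are assumptions made for illustration.

```python
def interval_square(lo, hi):
    """Range of t*t for t in [lo, hi]."""
    candidates = (lo * lo, hi * hi)
    low = 0.0 if lo <= 0.0 <= hi else min(candidates)
    return (low, max(candidates))

def circle_range(box):
    """Interval enclosure of x^2 + y^2 - 1 over box = ((x0, x1), (y0, y1))."""
    (x0, x1), (y0, y1) = box
    sx = interval_square(x0, x1)
    sy = interval_square(y0, y1)
    return (sx[0] + sy[0] - 1.0, sx[1] + sy[1] - 1.0)

def split(box):
    """Split a box into 2^n sub-boxes by halving every coordinate interval."""
    boxes = [()]
    for lo, hi in box:
        mid = 0.5 * (lo + hi)
        boxes = [b + (half,) for b in boxes for half in ((lo, mid), (mid, hi))]
    return boxes

def localize(box, enclosure, depth):
    """Return the sub-boxes that may contain zeros; discard all others."""
    lo, hi = enclosure(box)
    if lo > 0.0 or hi < 0.0:
        return []  # zero is excluded on the whole box
    if depth == 0:
        return [box]
    result = []
    for sub in split(box):
        result.extend(localize(sub, enclosure, depth - 1))
    return result
```

Because the enclosure always contains the true range of the function, no box holding a point of the curve is ever discarded; this is the sense in which subdivision is a guaranteed method.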

Unlike algebraic methods, techniques from differential geometry are not yet in widespread
use  in  geometric  and  solid  modeling.  It  is  not  clear  why  this  is  the  case,  but  we  expect  that  this 
situation  will  change,  and  we  are  exploring  the  utility  of  differential  concepts  in  solid  modeling. 

Joint  with  F.-E.  Wolter,  we  have  developed  several  experimental  programs  to  find  projections 
and  track  the  projection  of  a  curve  to  a  surface.  These  tools  are  presently  unsatisfactory,  but  there 
appear  to  exist  ways  to  improve  them.  These  ways  would  modify  what  is  now  a  classical  differential 
approach,  and  integrate  some  algebraic  techniques. 


4  Research  Issues  in  Physical  Modeling 

Project  Newton  develops  a  highly  modularized  and  extensible  system  to  duplicate  the  precise 
behavior  of  physical  objects  from  their  models.  The  work  is  done  cooperatively  with  John  Hopcroft 
(Cornell)  and  should  have  a  major  impact  on  computer  science,  engineering,  and  manufacturing. 
4.1 The Problem


Simulation  and  analysis  of  physical  systems  is  a  vast  subject  in  which  there  has  been  much  work 
in  nearly  all  branches  of  science  and  engineering.  Despite  this  long  and  illustrious  history,  we  can 
identify  several  gaps  in  the  traditional  approaches: 

1.  Limited  Geometry.  Complicated  shapes  are  not  normally  modeled,  and  the  interaction  be¬ 
tween  shape  and  consequent  physical  behavior  is  relatively  unexplored. 

2.  Fixed  Modality.  Things  are  either  elastically  deforming,  or  they  flow  plastically.  The  change 
from  one  behavior  to  the  other  is  not  modeled. 

3. No Multiplicity. Real objects behave and interact with the environment in a multiplicity
of  ways.  They  may  simultaneously:  accelerate,  heat  up  or  cool  down,  vibrate,  and  so  on. 
Typically,  only  one  aspect  is  modeled;  any  interaction  between  the  various  aspects  is  ignored. 

While  many  questions  associated  with  these  limitations  belong  to  specific  disciplines  in  science  and 
engineering,  there  is  a  computer  science  component  that  can  and  should  be  investigated.  Moreover, 
the  questions  are  in  many  respects  interdisciplinary. 

4.2  Approach  to  Physical  Modeling 

The simulation of objects and their physical behavior is based on the geometry of their shapes. From
the geometric descriptions, the system automatically formulates the needed mathematical models
that describe the laws of possible mechanical motion. As the simulation progresses, the system
will reformulate these models as needed; for example, in response to collision, or a disappearing
contact between two objects. Both the automatic model formulation and the automatic model
modification are novel aspects of the work.

The  original  system  design  is  based  on  Newtonian  mechanics.  Methods  are  being  explored  to 
overcome  the  intrinsic  limitations  of  Newtonian  mechanics  and  to  increase  the  scope  of  phenomena 
that  can  be  simulated.  We  refer  to  this  research  as  extending  the  physical  coverage,  just  as  our 
research  in  geometric  modeling  aims  at  extending  the  geometric  coverage. 

Extending  the  physical  coverage  does  not  necessarily  involve  breaking  new  ground  in  physics 
or  mechanical  engineering,  since  most  of  the  phenomena  we  would  like  to  simulate  can  already  be 
simulated  by  suitable  finite  element  codes.  However,  finite  element  codes  are  developed  for  specific 
physical  phenomena  in  isolation,  and  we  would  like  to  track  the  phenomena  simultaneously.  More¬ 
over,  these  programs  are  meant  to  be  used  by  specialists.  They  have  limited  geometric  capabilities 
and  limited  automatic  capabilities.  It  is  our  aim  to  interface  the  Newton  system  with  these  codes 
in  such  a  way  that  human  intervention  and  problem  formulation  becomes  largely  unnecessary.  This 
activity  is  in  part  similar  to  software  engineering,  in  that  we  wish  to  combine  existing  complex 
software  systems  with  each  other  without  extensively  rewriting  them. 




4.3 Progress in Physical Modeling


A second implementation is now operational, developed in Common Lisp on Symbolics Lisp machines.
Its coverage includes geometric shape, rigid body dynamics, control model evaluation,
interference detection, and collision simulation.

An  experimental  interface  to  finite  element  codes  has  been  constructed,  but  it  is  not  yet 
fully  general.  First  experiments  demonstrate  that  it  is  possible  to  extend  the  Newton  system  by 
interfacing  it  with  complementary  software  packages.  Moreover,  this  interface  is  across  physical 
machine boundaries, i.e., the Newton system is in the process of being distributed over a network
of  cooperating  machines. 

The  scientific  problems  raised  by  this  interface  include  determining  faithfully  the  initial  values 
entailed  by  the  collision.  A  possible  way  to  derive  the  initial  conditions  for  the  FEM  problem  is  to 
first  determine  the  impulses  resulting  from  the  collision,  and  then  to  assign  the  appropriate  initial 
velocities to the elements involved. More general paradigms are under investigation.
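The impulse-then-velocity idea can be sketched for the simplest case: a frictionless collision between two non-rotating bodies along a known contact normal. The Python sketch below computes the impulse from the standard restitution law and then gives every finite element node of a body the body's post-collision velocity. The restitution parameter, the neglect of rotation and friction, and all names are simplifying assumptions for illustration; they are not the scheme used in the Newton system.

```python
def collision_impulse(m1, v1, m2, v2, normal, restitution=1.0):
    """Impulse magnitude along `normal` (unit vector from body 2 to body 1)."""
    rel = sum((a - b) * n for a, b, n in zip(v1, v2, normal))
    if rel >= 0.0:
        return 0.0  # bodies are separating; no impulse is exchanged
    return -(1.0 + restitution) * rel / (1.0 / m1 + 1.0 / m2)

def post_collision_velocities(m1, v1, m2, v2, normal, restitution=1.0):
    """Apply equal and opposite impulses to the two bodies."""
    j = collision_impulse(m1, v1, m2, v2, normal, restitution)
    v1_new = tuple(a + j * n / m1 for a, n in zip(v1, normal))
    v2_new = tuple(b - j * n / m2 for b, n in zip(v2, normal))
    return v1_new, v2_new

def initial_node_velocities(node_ids, body_velocity):
    """Give every finite element node of a body the same initial velocity."""
    return {node: body_velocity for node in node_ids}
```

For two equal masses approaching head-on with restitution 1, the velocities are exchanged; with restitution 0 both bodies come to rest relative to each other. Total momentum is conserved in either case.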
6 Results from Prior Naval Support

We  summarize  the  work  and  accomplishments  due  to  the  prior  support  through  contract  N00014- 
86-K-0465,  during  the  period  of  8/86  through  now. 


6.1  Books 

1. “Geometric and Solid Modeling”, to be published by Morgan Kaufmann, San Francisco, July
1989.

2.  Editor  of  “Issues  in  Robotics,”  JAI  Press,  to  appear  late  1989. 

3. Editor of “Algorithmic Aspects of Geometry and Algebra,” Springer Verlag (with E. Kaltofen
and  C.  Yap). 
6.2 Papers and Technical Reports


1. “The Potential Method for Blending Surfaces and Corners,” in Geometric Modeling, G. Farin,
ed.,  347-365,  SIAM  1987. 

2.  “Simulation  of  Physical  Systems  from  Geometric  Models,”  IEEE  J.  Robotics  and  Autom., 
RA-3, 1987, 194-206. 

3.  “Geometric  Ambiguities  in  Boundary  Representations,”  Computer  Aided  Design  19,  1987, 
141-147. 

4.  “Projective  Blending  Surfaces,”  Artificial  Intelligence  37, 1988,  357-376. 

5. “Algebraic Curves,” in Mathematical Aspects of Scientific Software, J. Rice, ed., IMA Volumes
in Math. and Appl., Springer Verlag, 1988, 101-122.

6.  “Towards  Implementing  Robust  Geometric  Computations,”  Proc.  Conf.  Comp.  Geometry, 
Urbana,  Ill.,  1988. 

7.  “Tracing  Surface  Intersections,”  Computer  Aided  Geometric  Design  5, 1988,  285-307. 

8.  “Model  Generation  and  Modification  for  Dynamic  Systems  from  Geometric  Data,”  Springer 
NATO ASI Series F50, 1988, 481-492.

9.  “The  Problem  of  Accuracy  and  Robustness  in  Geometric  Computation,”  IEEE  Computer  33, 
31-42. 

10. “Local Implicitizations of Curves and Surfaces,” ACM Trans. on Graphics, to appear 1989.

11. “Robust Boolean Operations on Polyhedral Solids,” TR-87-875.

12.  “A  Dimensionality  Paradigm  for  Surface  Interrogation,”  TR  88-837;  submitted  to  CAGD. 

13.  “On  the  Geometry  of  Dupin’s  Cyclide,”  The  Visual  Computer  5,  to  appear  in  June. 

14.  “Variable  Radius  Blending  with  Dupin  Cyclides,”  to  appear  late  1989. 

6.3  Workshops  Organized 

1.  “Computational  Issues  in  Robotics,”  IMA  Minnesota,  August  1987. 

2.  “Blending  Surfaces,”  Minisymposium,  SIAM  Conf.  on  Applied  Geometry,  Albany  1987. 

3.  “Algorithmic  Aspects  of  Geometry  and  Algebra,”  MSI  Cornell,  July  1988. 

4. “Applying Algebraic Geometry to Surface Intersection,” Short course, SIGGRAPH 88.

5.  “Computing  about  Physical  Objects,”  Minisymposium,  SIAM  Conf.  on  Applied  Geometry, 
Arizona,  November  1989. 

6. “The Computational Paradigm in Science and Engineering,” Symposium at the annual meeting
of the American Assoc. for the Advancement of Science, New Orleans, February 1990.
6.4 Invitations to Workshops

1.  NSF  Workshop  on  Geometric  Reasoning,  Oxford,  July  1986. 

2. IMA Workshop on Supercomputing, Minnesota, March 1987.

3.  NSF  Res.  Conf.  Geometric  Modeling  and  Robotics,  Detroit,  May  1987. 

4. Summer Program on Robotics, IMA Minnesota, August 1987.

5.  NATO  Workshop  on  CAD  Based  Programming  for  Sensor  Based  Robots,  Italy,  July  1988. 

6. Trento School on VLSI Design and Parallel Algorithms, Italy, July 1988.

7. MSI Workshop on Gröbner Bases, Cornell, 1988.

8.  NSF-IFIP  Workshop  on  Solid  Modeling,  Rensselaer,  September  1988 

9.  Oberwolfach  Workshop  on  Applicable  Algebra,  West  Germany,  January  1989. 

10.  NSF  Workshop  on  Information  Technology,  Atlanta,  March  1989 

11.  Oberwolfach  Workshop  on  Surfaces  in  CAGD,  West  Germany,  April  1989 

12.  NATO  School  on  CAGD,  Canary  Islands,  July  1989. 

6.5  Talks  at  Universities  and  Labs 

1.  General  Electric,  Schenectady,  1987 

2.  Courant  Institute,  New  York,  1987 

3. Carnegie-Mellon University, Pittsburgh, 1988

4. USC, Los Angeles, 1988

5.  University  of  Washington,  1988 

6.  University  of  Maryland,  1988 

7.  University  of  Linz,  Austria,  1988 

6.6  Editorial  Responsibilities 

1.  Editor,  SIAM  Frontiers  Series  on  Applied  Geometry. 

2.  Editorial  Board,  Journal  of  Symbolic  Computation. 

3.  Editorial  Board,  Journal  for  Applicable  Algebra. 

4. Editorial Board, Computer-Aided Geometric Design.
6.7 Professional Duties

1. Panel, NASA CESDIS Grants, 1988

2. Panel, NSF CISE-SS Infrastructure Grants, 1988

3. Site review, NSF CER program, Univ. Rochester, 1988

4. Program committee, ACM Symp. Computational Geometry, 1989

6.8  Software  and  Tools 

1. Box and rectangle intersection algorithm. Used to speed up polyhedral intersection algorithm.

2.  Dual  mode  polygonal  intersection  algorithm.  Test  case  for  the  dual  mode  polyhedral  inter¬ 
section  algorithm. 

3. 3D surface intersection algorithm. Explore the capabilities and limitations of purely numerical
approaches. 

4. Planar curve tracing algorithm using desingularization. Proof of concept: Numerical and
symbolic computation can be successfully combined.

5. Interface between S-geometry and Karasick’s polyhedral modeler. Tool to study user inter¬
faces. 

6. Newton system. Proof of concept: Automatic model construction, modification, and analysis
from  geometric  data  is  possible. 

7. Linear equation solver for distributed computation. Part of an effort to construct a distributed
version  of  the  Newton  system. 

8. Interface between Newton system and .EARN, a structural mechanics finite element package.
Proof of concept: Finite element techniques can be interfaced with the Newton system.

9. Dual mode polyhedral intersection algorithm (in progress). Test bed to study robustness
issues.

10.  Device  independent  display  algorithm  for  algebraic  surfaces.  Visualization  tool  for  surface 
research. 

11. Surface intersection algorithm in arbitrary dimensions. Proof of concept: Higher dimensional
problem  formulations  can  be  used  directly,  and  have  important  practical  benefits. 

12. Elimination algorithm. Proof of concept: Verify that we can speed up expensive symbolic
algorithms by reducing their generality.

13.  Point  projection  onto  curves  and  surfaces  (in  progress).  Proof  of  concept:  Assess  practical 
usefulness  of  differential  geometry  in  modeling. 
14. Track projection of a curve onto a surface (in progress). Proof of concept: Assess practical
usefulness of differential geometry in modeling.

15. Surface polygonalizer, implicit or parametric surfaces. Used for visualization and mesh gen¬
eration. 

Most of this software has been developed in Common Lisp for Symbolics Lisp machines, with
the  exception  of  the  display  algorithm  and  the  higher  dimensional  surface  intersection  algorithms 
which  are  written  in  C  for  Unix  machines. 
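The role of a box intersection test as a speed-up for polyhedral intersection (item 1 above) can be sketched as a bounding-box prefilter: only face pairs whose axis-aligned boxes overlap are handed to the expensive exact intersection code. The Python sketch below uses a naive pairwise comparison; the actual algorithm listed above is not described in this report and may well be more refined, so the names and structure here are assumptions.

```python
def bounding_box(points):
    """Axis-aligned bounding box of a point set, as (mins, maxs)."""
    mins = tuple(min(c) for c in zip(*points))
    maxs = tuple(max(c) for c in zip(*points))
    return mins, maxs

def boxes_overlap(a, b):
    """True if two axis-aligned boxes share at least one point."""
    (amin, amax), (bmin, bmax) = a, b
    return all(lo1 <= hi2 and lo2 <= hi1
               for lo1, hi1, lo2, hi2 in zip(amin, amax, bmin, bmax))

def candidate_pairs(faces_a, faces_b):
    """Index pairs (i, j) of faces whose bounding boxes overlap."""
    boxes_a = [bounding_box(face) for face in faces_a]
    boxes_b = [bounding_box(face) for face in faces_b]
    return [(i, j)
            for i, box_a in enumerate(boxes_a)
            for j, box_b in enumerate(boxes_b)
            if boxes_overlap(box_a, box_b)]
```

The filter is conservative: overlapping boxes do not imply intersecting faces, but disjoint boxes do rule an intersection out, so no true intersection is missed.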


7  General  Research  Directions 

The  following  themes  are  considered  to  be  of  critical  importance  to  promoting  the  utility  of  com¬ 
putation  in  science  and  engineering,  especially  in  manufacturing. 

1.  Research  into  the  substrata  problem  in  geometric  modeling.  What  can  we  do  to  design 
correct  implementable  algorithms  with  the  needed  performance?  Can  we  retro-fit  methods 
onto existing modelers that increase robustness? Can we devise “approximate” models, either
in  the  sense  of  tolerance,  or  in  the  sense  of  statistical  variation? 

2.  Integrate  symbolic  and  numerical  methods.  Very  few  examples  can  be  cited  in  which  the  best 
aspects  of  each  approach  have  been  combined.  There  should  be  many  trade-offs,  but  what 
we  know  is  only  anecdotal. 

3.  Develop  conceptual  geometric  design.  What  is  a  feature?  What  is  design  detail,  what  is 
conceptual  design?  At  this  time,  even  case  studies  would  be  useful.  Case  studies  might 
consider  specific  applications  in  aircraft  wing  design,  ship  hull  design,  and  space  applications. 

4.  Develop  conceptual  functional  design.  What  is  the  interaction  between  feature  and  function¬ 
ality?  How  do  tolerances  affect  functionality? 


8 Hilbert-Size Problems?

1.  Given  a  3-dimensional  geometric  object,  remove  all  surface  structures  of  size  smaller  than  a 
given  tolerance  e. 

2. Given an algebraic equation f = 0 in the variables x_1, ..., x_m and of degree n. Find k algebraic
equations h_1 = 0, ..., h_k = 0, in m + r variables, such that the algebraic degree of each h_i is
strictly less than n and the projection of the algebraic set defined by the h_i onto the subspace
defined by the x_1, ..., x_m is the algebraic set of f.*

3. Given n rigid objects in contact, each acted upon by known external torques and forces,
devise an efficient algorithm to determine all contact forces.

*For example, given a quartic curve f(x,y) = 0, can it be obtained as projection of the intersection of two quadrics,
h_1(x,y,z) = 0 and h_2(x,y,z) = 0?
4. Give a definition of feature and show that it is unambiguous. Then devise a recognition
algorithm. 
0. Abstract

The development of software engineering as a discipline has been
influenced substantially by the development of formal, mathematical
techniques for reasoning about computer programs. One of the most
promising avenues of research is to develop formal mathematical
theories for specifying and proving the correctness of computer
systems. The key idea here is that if one produces a proof that a
computing system satisfies its specification, then the only reasons
for which the system can fail to work as intended are (a) hardware
failure and (b) the failure of the formal specifications to capture
intended behavior. Because the development of formal proofs is itself
a very error-prone and tedious activity, we have been pursuing,
partially with ONR support, the development of a computer program for
checking proofs of correctness of computer systems. We have made
substantial progress in this direction using constructive mathematical
theories, and with ONR support we are continuing to extend our work
through mathematical theories of greater power.

1. Background.

Boyer and Moore began collaboration on their mechanized logic and
theorem prover in the early 1970s. A summary of their work through
1979 is given in [ACL79]. The following excerpt is taken from
[ACLH88] and describes some of the research stimuli for Boyer and
Moore during the last few years.

.... perhaps the most important change since the publication of
"A Computational Logic" was that in 1981 we moved from SRI
International, where we were involved exclusively in research, to
the University of Texas at Austin. Our research home at the
University of Texas was the Institute for Computing Science.
However, as professors in the Department of Computer Sciences, we
teach.  In  1981  we  began  teaching  a  graduate  course,  now  called 
"Recursion  and  Induction",  on  how  to  prove  theorems  in  our 
logic,  and  we  initiated  a  weekly  logic  seminar  attended  by 
graduate  students,  other  faculty  members,  and  logicians  from 
local  research  organizations.  These  efforts  dramatically 
increased  the  number  of  people  familiar  with  our  work.  In 
addition,  we  began  using  the  theorem  prover  to  check  the  proofs  of 
theorems  we  wanted  to  present  in  class  (e.g.,  the  unsolvability 
of the halting problem).

Kaufmann first became involved in this work in the context of adapting
the Boyer-Moore prover to functional language verification while at
the Burroughs Austin Research Center, 1984-86. He joined the
Institute for Computing Science at the University of Texas (where
Boyer and Moore were located) when that Center was closed down in the
summer of 1986. His previous experience as a mathematical logician
has helped stimulate the current push to add capabilities in
first-order quantification and set theory.

2. Research Objectives.

The long range objectives of the research include

** enabling programmers to produce software that is mathematically
proven to meet its specifications by using mechanical
theorem-proving programs that check proofs

** supporting proofs of correctness of computing systems to provide a
trusted base for those applications

3. Research Issues.

The key idea here is that if there is a proof that a computing system
satisfies its specification, then the only reasons for which the
system can fail to work as intended are (a) hardware failure and (b)
the failure of the formal specifications to capture intended behavior.

4.  Approach. 

The prover was developed as a program verification system partially
under ONR support. A basic version of the logic and prover is
documented in [ACL79]. Enhancements to the logic and prover are built
on that base in our approach. A more up-to-date version of the logic
and prover, documented in [ACLH88], illustrates this approach by
documenting the following extensions to the prover and logic.

** a hints facility which allows a significant measure of user
control over the prover

** a fully integrated decision procedure for linear arithmetic

** a facility for "metatheoretic extensibility", i.e. for allowing
the user to extend the theorem prover in a provably sound manner

** NQTHM: an extension to support partial functions, bounded
quantification, and an interpreter for the logic within the
logic

Three  principles  are  fundamental  to  our  approach. 

** The logic is completely specified and the prover is implemented
with  extreme  care  so  that  it  soundly  implements  the  logic. 

**  The  criterion  for  success  of  the  prover  is  its  successful 
application  to  specific  theorems  to  prove. 

**  Prover  use  is  an  important  part  of  the  process  of  deciding  how  to 
extend  its  capabilities. 


Let us note that many other formal methods exist which have varying
degrees of mechanical proof support. Some of these, such as the
Edinburgh Logical Framework, are concerned more with foundations than
with applications to program verification. Others such as Z and VDM
(which are gaining popularity especially in Europe) emphasize formal
reasoning without the benefit (or burden!) of mechanical proof
support. Among those systems which emphasize mechanical proof
support, ours is unsurpassed in the collection of theorems which have
been proved (or "proof-checked") using the system. The following
section shows that the logic and its proof support are very live
research areas and we feel quite strongly that our most productive
research route is to build heavily on our previous work. This may
involve some rather serious changes; for example, the V&C$ and EVAL$
approach to bounded quantification may be replaced by different
additions to the logic. However, we believe that our basic approach
of using recursion and induction with a formal constructive logic is a
sound one on which we will continue to build. We will continue to get
feedback from users of the system as a means for improving its
utility.

5. Payoffs

The ultimate payoff of this technology is the ability to mechanically
proof-check mathematical properties, especially of computer software
and hardware. Many individuals have successfully mechanically
proof-checked theorems in the following areas: elementary list
processing, number theory, metamathematics, bounded quantification,
communication protocols, concurrent algorithms, Fortran programs, real
time control, assembly language implementation, operating system
implementation, compiler correctness, hardware verification, and set
theory. Several of these efforts have become the main components of
doctoral dissertations.

Another kind of payoff is the progress that our approach has made in
improving the mechanical tools for the logic and even the logic
itself. Such progress includes:

**  an  interactive  proof-checker  enhancement 
**  an  improved  facility  for  reusable  theories 

**  an  extension  to  the  logic  and  prover  to  allow  "partial  definitions" 
and functional instantiation

**  preliminary  extensions  to  the  logic,  prover,  and  proof-checker  to 
allow  first-order  quantifiers  and  a  rudimentary  capability  in  set 
theory 

**  an  experimental  extension  to  improve  reasoning  power  for  equivalence 
relations 

**  an  experimental  extension  to  the  execution  environment  to  allow 
constant-time  array  access  and  update  while  remaining  in  a  purely 
functional framework

6.  Research  Directions 

The  following  subsections  lay  out  some  of  our  current  research 
directions  in  automated  reasoning  and  program  verification.  We  also 
are  involved  in  a  number  of  applications  of  this  technology.  For 
example, one of our primary focuses at Computational Logic, Inc., is
on  "trusted  systems",  i.e.  on  provably  correct  implementations  of 
high-level  languages  on  hardware.  Our  position  paper  for  the  upcoming 
ONR-sponsored "Workshop on Directions in Software Analysis and Testing"
[ONRWorkshop] gives an overview of this line of research. And there
is also research underway in the application of the Boyer-Moore logic
and its proof support to the mechanical verification of properties of
distributed programs and floating-point algorithms. However, we
confine ourselves below to those areas related to extending our
capabilities in automated reasoning and program verification.

a. First-order quantification and set theory

Reasoning about computer systems requires skill in two distinct types
of mathematical theories: the constructive and the set theoretic. On
one hand, arguments must be made about the elementary constructible
objects that one actually finds in computers, such as integers, finite
lists, and strings. On the other hand, in order to specify the
interaction of computing systems with the real world and to specify
the interconnection and interdependence of computing systems, one
needs the full range of mathematical concepts, such as are usually
developed within set theory. For example, both of the systems
currently approved by the DoD for A1 security level certification,
Gypsy and FDM, utilize set theory.

Under support from ONR, we have developed a program specification and
verification system which is unsurpassed in its facility for making
inferences within a constructive theory. So far, however, no program
verification system has been developed in set theory with comparable
power. Recent progress made under ONR support suggests a method for
adding set theory and related mathematical concepts to our system.

The main idea is to add an interface from first-order logic to the
Boyer-Moore logic, and to generalize the notion of Skolemization to
extend this interface to expressions that contain set-builder notation.
This research involves (a) theoretical research in the selection
and formulation of a precise set theory and quantification theory, (b)
practical research in the implementation of a theorem-prover capable
of making automatic inferences about questions in the selected theory,
and (c) demonstrations of the applicability of the developed
theorem-proving techniques (ultimately, to the verification of
substantial computing systems).

We do not currently intend to add first-order quantifiers and set
theory as primitives in the logic. Such a radical decision would
probably require wholesale recoding of the theorem prover, for example
because of bound variables, and possibly some wholesale reworking of
its heuristics, which currently are based largely on the
recursion-induction duality and rewriting but not unification.

Instead, we are pursuing an approach which uses Skolemization as an
interface from first-order logic to the constructive logic currently
in use. We have already enjoyed preliminary success in this venture
by proof-checking formalizations of Cantor's theorem that the power
set of a set is not of the same cardinality as the set, of Koenig's
tree lemma, of the infinite exponent-2 Ramsey Theorem, and of the
Schroeder-Bernstein Theorem.
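For readers unfamiliar with Skolemization, the textbook transformation on a prenex formula can be sketched as follows: each existentially quantified variable is replaced by a new function symbol applied to the universally quantified variables in whose scope it lies, leaving a formula with no explicit existential quantifiers. The Python sketch below is only an illustration of that standard step. The term representation, the names sk1, sk2, ..., and the assumption that all bound variables have distinct names are ours; this is not the interface to the Boyer-Moore logic described above.

```python
def substitute(term, bindings):
    """Replace variables (plain strings) in a nested-tuple term."""
    if isinstance(term, str):
        return bindings.get(term, term)
    head, *args = term
    return (head, *(substitute(a, bindings) for a in args))

def skolemize(prefix, matrix):
    """Skolemize a prenex formula.

    prefix: list of ("forall" | "exists", variable) pairs, outermost first.
    matrix: quantifier-free formula as nested tuples, e.g. ("P", "x", "y").
    Returns (universal_variables, skolemized_matrix).
    """
    universals = []
    bindings = {}
    counter = 0
    for kind, var in prefix:
        if kind == "forall":
            universals.append(var)
        elif kind == "exists":
            counter += 1
            # The witness may depend on every universal variable in scope.
            bindings[var] = (f"sk{counter}", *universals)
        else:
            raise ValueError(f"unknown quantifier: {kind}")
    return universals, substitute(matrix, bindings)
```

For example, "for all x there exists y with P(x, y)" becomes P(x, sk1(x)) with x implicitly universal, and an outermost existential becomes a Skolem constant, represented here as a one-element tuple.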

In  order  to  extend  the  set-theoretic  capabilities  of  the  system,  our 
initial  approach  will  be  to  introduce  sets  as  objects.  However,  we 
will  provide  "set-builder  notation"  only  as  syntactic  sugar. 
Specifically,  we  believe  that  we  can  successfully  extend  Skolemization 
from  first-order  logic  into  the  realm  of  set  theory  by  using  it  to 
eliminate  set-builder  expressions. 


We will extend and use the interactive proof-checker, PC-NQTHM, to
begin to look for useful proof methods and heuristics in the extended
logic. Most of the extensions to PC-NQTHM will be in the form of
macro commands that expand into sequences of primitive commands; this
macro facility is already in heavy use and gives us the ability to add
new functionality to the proof checker without risking soundness.
Extensions to the logic are most easily accomplished in PC-NQTHM where
it is unnecessary for us to implement heuristic controls on the
applications of new rules of inference or axioms. Once PC-NQTHM is
extended we will begin to use it to prove many theorems in set theory
and related applications. As we develop and codify the heuristics for
manipulating the new concepts we will add new proof heuristics to
NQTHM.

We will probably investigate these issues a bit more before bringing
in a graduate student, in order to provide reasonable guidance. If we
do not find a graduate student interested in pursuing this area for
dissertation research, we will probably hire a couple of students to
exercise a system that we build.

Note that we do not in general encourage students to build new
general-purpose automated reasoning systems. The Boyer-Moore prover
is the product of some 30+ man-years of effort; hence we expect it to
continue to be rare that someone with limited experience can build a
state-of-the-art general-purpose theorem-prover. However, we do
believe that the pursuit of provers with selected strengths is a
reasonable research topic for a graduate student. In particular, as
we discussed above, we expect to support a student under our current
ONR contract to pursue research in the directions outlined in the
paragraphs above.

b.  Other  extensions  of  the  prover 

We plan to extend NQTHM towards an open-architecture formal reasoning
system. This will include implementing an advanced library mechanism,
continuing to develop and disseminate reusable theory libraries,
implementing equivalence reasoning, increasing support for team proof
development, extending the interactive proof checker, and studying
integration with the Argonne prover, OTTER.

We also plan to extend NQTHM's heuristics to include congruence-based
rewriting. This can be seen as a step toward an open-architecture
system: the current NQTHM has many special-purpose heuristics for one
built-in primitive ("=") that can be made more generally available for
user-defined relations at the cost of formalizing the interface
(namely, the idea of congruence relations). We will also look for
other ways in which NQTHM's special purpose heuristics can be "opened
up" so that users can get the power of those heuristics applied to
more general concepts.
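The notion of a congruence relation that underlies this extension can be made concrete with a small example: a function respects an equivalence relation if equivalent arguments always produce equivalent results, and only then is it safe to rewrite an argument to an equivalent term. The Python sketch below checks both properties by brute force over a small finite domain. In a theorem prover these facts would be established as theorems rather than by enumeration; the names and the mod-3 example are assumptions made purely for illustration.

```python
from itertools import product

def same_mod3(a, b):
    """Example equivalence relation: equal remainder modulo 3."""
    return a % 3 == b % 3

def is_equivalence(rel, domain):
    """Check reflexivity, symmetry, and transitivity on a finite domain."""
    reflexive = all(rel(a, a) for a in domain)
    symmetric = all(rel(b, a)
                    for a, b in product(domain, repeat=2) if rel(a, b))
    transitive = all(rel(a, c)
                     for a, b, c in product(domain, repeat=3)
                     if rel(a, b) and rel(b, c))
    return reflexive and symmetric and transitive

def respects(fn, rel, domain, arity):
    """True if equivalent arguments always give equivalent results."""
    for xs in product(domain, repeat=arity):
        for ys in product(domain, repeat=arity):
            if all(rel(x, y) for x, y in zip(xs, ys)) and not rel(fn(*xs), fn(*ys)):
                return False
    return True
```

Addition respects the mod-3 relation, so a rewriter may replace an argument of a sum by any equivalent value; integer division by 3 does not, so the same rewrite would be unsound there.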

c.  Lisp  verification 

One  recent  exciting  area  has  been  the  application  of  our  system  to  the 
proofs  of  properties  of  Lisp  programs.  More  precisely,  we  have  built 
a  system  for  reasoning  about  programs  written  in  a  language  which 
satisfies  the  language  definition  requirements  for  a  subset  of  Common 
Lisp.  We  call  this  language  "Rose  Common  Lisp"  or  "RCL".  The 
existing  prototype  system  (see  [RCL])  is  actually  a  translator  from 
Common Lisp definitions to definitions in the logic supported by
NQTHM; because of the Lisp-like nature of NQTHM's logic and the care
taken in defining the translator, many RCL programs translate to
nearly identical NQTHM functions. However, the system handles
non-applicative constructs such as: assignment, both for local (LET)
and global (special) variables; explicit flow of control, both with
local and non-local exits (CATCH and THROW, BLOCK and RETURN-FROM) and
with "go-to" (PROG); property lists; and macro definition.

The significance of this effort lies largely with the fact that the
language in question is an implementation of Common Lisp. To the best
of our knowledge there do not exist verification systems for any "real
languages" which have even the power of the existing prototype. By
"real languages" here we mean ones that are dialects of languages in
everyday use by programmers who have no special interest in formal
verification.

We will work many more small examples in the course of further
developing the system. However, we propose to demonstrate the
feasibility of our approach by verifying a significant application,
such as a portion of NQTHM (e.g., the "type-set" mechanism which
determines the type of an expression, the "clausify" mechanism that
converts an IF-expression into clausal form, the pattern matcher that
finds instances of rewrite rules).

To support technology transfer, RCL will ultimately be written in RCL,
in a manner such that we expect it to run correctly on any Common Lisp
implementation. It will therefore be highly portable. In addition,
we will carefully document the final system so that it is accessible
to Lisp programmers outside of Computational Logic, Inc.

7.  Grand  challenge. 

OK,  how's  this? 

Prove the correctness of the implementation of a functional
programming language with respect to its denotational semantics.

Or  this? 

Formalize  the  real  numbers  using  set  theory  (by  way  of  Dedekind  cuts, 
say) ,  and  prove  some  properties  of  the  reals  as  well  as  some 
properties  of  some  simple  algorithms  over  the  reals. 

Actually, a more close-to-home goal is to extend the CLI "verified
stack" work [ONRWorkshop] to provably correct running execution
environments, encompassing the high-level language level down to the
register-transfer hardware level.

8.  Research  transitions. 

A  near-term  beneficiary  of  this  research  is  anyone  who  wishes  to 
formally,  and  with  assurance,  prove  mathematical  properties.  In 
particular  such  properties  might  be  correctness  of  software  and 
hardware  systems,  and  in  that  sense  we  are  already  our  own  customer  at 
Computational  Logic,  Inc.,  with  the  various  "trusted  systems"  proofs. 
But  who  outside  our  group  might  be  interested  in  mechanical 
verification  of  mathematical  properties? 

The research community is of course one obvious place to look for
those who might take advantage of our work. Some of the obvious ways
we will continue to transfer this technology to that community are by
way of books, technical reports, journal publications, and
conferences. In addition, all three of us teach some courses at the
University of Texas at Austin, and of course this spreads the
technology into the university community. (Of particular relevance to
our current ONR contract is the fact that Kaufmann will be teaching a
graduate-level Set Theory course this fall.)

The Rose Common Lisp project described above is one method we see for
bringing this research into practical use by software engineers whose
primary interest is not program verification or automated reasoning.
With that project we envision opening up this technology to the
general community of Lisp programmers.

Our group is also investigating collaboration with hardware
manufacturers in using our methods to help produce correct hardware.

The Boyer-Moore theorem prover is actually a highly interactive tool.
Nevertheless, some users have found that the enhanced interactive
capabilities offered by the PC-NQTHM proof-checker, which was
developed primarily under ONR support, provide a useful interface to
the logic. We will continue to maintain PC-NQTHM to keep it
up-to-date with the latest version of NQTHM.

Program verification is still a primary intended use of the results of
our research. However, in spite of the practices outlined above,
there is still a large gap between what we do on a daily basis with
our tools at Computational Logic, Inc., and what the rest of the world
is doing. We need to look for more ways to bring our work into the
mainstream. An example of a rather recent move in that direction is
the addition of the book mechanism to the prover, which will
facilitate reuse of theories and team collaboration in much the same
way that modern software development environments are supposed to
encourage reuse of code and programmer cooperation. But we need to
keep looking for more ways to bring our technology into the mainstream
of the software development process.

9. Technological impacts.

Everyone  likes  more  computing  power.  Nevertheless,  at  this  time  we 
are  reasonably  content  with  our  equipment.  We  do  anticipate  a  use  for 
parallel  architectures,  especially  for  the  rapid  replay  of  proof 
files.  In  fact  we  have  recently  constructed,  with  ONR  support,  such  a 
capability  on  a  network  of  Unix  machines. 

10.  Societal  Issues  and  Miscellaneous  Flaming. 

We're  happy  that  our  contract  is  multi-year.  Proposal  writing  is 
extremely  time-consuming. 

11.  Recommendations  to  Funding  Agencies 


<none>
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1  Background 

Our research for more than ten years has focused on environments for editing complex
structured objects: computer programs, proofs of theorems, program specifications,
spreadsheets, and the like. Our premise has been that integrated systems that provide
immediate feedback during the creation and transformation of these objects provide
substantial improvements in productivity over traditional "batch" systems. The most
frequently cited example of this type of system is a programming environment that tightly
couples tools for program editing, browsing, analysis, transformation, execution and
debugging.

Our early work in this area culminated in the development of the Cornell Program
Synthesizer [TR81], a highly integrated environment for a small subset of PL/I. The
Synthesizer graphically demonstrated the feasibility of building a self-contained, highly
interactive environment that supplanted many of the traditional batch-oriented development
tools. Whereas it was far too limited to serve as a tool for professional programmers, it was
used successfully as a tool for teaching top-down structured programming. In the space of
several years it served more than 20,000 introductory programming students at Cornell and
other universities.

It quickly became apparent that "hard-coding" an environment for a specific language,
as was done with the Synthesizer, was the wrong approach. The most challenging aspect of
building Synthesizer-like systems is the problem of efficiently maintaining derived
context-sensitive information as the underlying object changes; for example, updating object
code after each editing modification to source code. We believed that this problem was
amenable to a generic solution and accordingly embarked on a study of incremental
computation.


2  Research  Objectives 


Our long-term objective has been to develop a comprehensive theory of incremental
computation that allows cost-effective re-use of previous executions. We maintain that
incremental computation has the potential to reduce processing time significantly for a wide
variety of applications.

Our interest in incremental computation stems from a narrower and more immediate
objective of improving productivity by showing how to design and implement effective
programming environments and environments for formal reasoning. We maintain that such
environments can make excellent use of the methods we develop for incremental
computation.


3  Research  Issues 


The problem of incremental computation can be posed in the following terms. Let $F$ be
a computable function mapping $X \to Y$, two arbitrary domains. Let $x_0, x_1, \ldots, x_n$
be a sequence of values in $X$, where the distances between successive values of $x$ are
small, for some notion of distance in $X$. We wish to compute
$F(x_0), F(x_1), \ldots, F(x_n)$ in an on-line fashion; i.e., each $F(x_i)$ must be computed
without reference to subsequent values $x_{i+1}, \ldots, x_n$. We say that $F$ is computed
incrementally if each computation of $F(x_i)$ takes advantage of the fact that we have
already computed $F(x_0), \ldots, F(x_{i-1})$.

By definition, then, incremental computation is a subject with broad applicability. Since
$X$ and $Y$ are arbitrary, we see incremental evaluation as a research area whose domain
spans a wide spectrum of computable problems. $F$ might be a function that inverts a matrix
$x$, finds the transitive closure of graph $x$, compiles a high-level language program $x$,
loads a set of object modules $x$, or computes navigational information for an aircraft from
sensor data $x$.

The significance of incremental computation lies in its potential to dramatically reduce
processing time across this broad problem domain. In non-incremental computations, each
evaluation of $F(x_i)$ is ignorant of the intermediate results of any previous execution. Yet,
in many applications, when $x$ changes just slightly, the bulk of the computational steps
involved in computing $F(x)$ remain the same. For example, recomputing a matrix
operation after the change of a single cell can often reuse intermediate results from a
previous computation. Similarly, the object code produced by a compiler after a single line
change in a program differs little from that before the change. Incremental computation can
offer a dramatic savings in machine cycles by re-using unchanged intermediate results from
previous computations of $F$ and only evaluating the subset of the problem that is affected
by the change from $x_i$ to $x_{i+1}$. By bypassing complete reevaluation of previous
intermediate results, incremental algorithms can have asymptotically better running time
than the non-incremental alternative.
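
The single-cell matrix example can be made concrete with a small sketch. The class and
function names below are hypothetical and chosen only for illustration; the "matrix
operation" is assumed to be nothing more than row sums and a grand total, which is far
simpler than the operations the report has in mind but shows the same reuse of retained
intermediate results.

```python
class IncrementalRowSums:
    """Keeps per-row sums and the grand total of a matrix up to date."""

    def __init__(self, matrix):
        self.matrix = [list(row) for row in matrix]
        # Intermediate results retained from the initial (batch) computation.
        self.row_sums = [sum(row) for row in self.matrix]
        self.total = sum(self.row_sums)

    def set_cell(self, i, j, value):
        """Apply a single-cell change and repair only the affected results."""
        delta = value - self.matrix[i][j]
        self.matrix[i][j] = value
        self.row_sums[i] += delta  # exactly one row sum is affected
        self.total += delta        # and the grand total


def batch_total(matrix):
    """Non-incremental baseline: recompute everything from scratch."""
    return sum(sum(row) for row in matrix)


m = IncrementalRowSums([[1, 2], [3, 4]])
m.set_cell(0, 1, 10)
# The incrementally maintained result agrees with full recomputation.
assert m.total == batch_total(m.matrix)
```

Each update costs a constant number of additions, whereas the batch baseline touches
every cell; the gap widens as the matrix grows.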

We believe that the general application of incrementality will accelerate the trend that
has shortened computer response time to changing inputs. Just as improved hardware and
systems software precipitated the transition from batch to interactive systems, incremental
evaluation allows what we call immediate systems. This computational model resembles
the well-known spreadsheet, in which rapid recomputation permits a "what-if" approach
to problem solving. By experimenting with various inputs to a problem, the user of the
spreadsheet can arrive at an optimal solution. Incremental computation applied broadly
can extend this spreadsheet-like interaction to complex problems like interactive theorem
proving systems and programming environments where program results are immediately
reevaluated as input and source are edited.

Although immediate systems are the most natural application of incremental evaluation,
this technique is also effective in systems where a user is not directly involved in the
computational process.

In real-time applications, for example, quick response to asynchronous changes in
individual sensor values is required. Viewing the collection of sensor data as a vector input
quantity $x$ and the required setting of control parameters as the result of a vector-valued
function $F$, such a program must respond to each change in a component (or subset of the
components) of the input $x$. Incremental computation of $F$ in response to small changes in
sensor data may be required in order to meet real-time requirements.

Any program that alternates data analysis with data transformation may be a candidate
for incremental evaluation. Consider, for example, the code optimization phase of a
compiler. Typically, extensive data flow information is gathered by analysis of the
program control flow graph. Based on this information, the compiler selects a code-improving
transformation, which produces some modification of the control flow graph. In general,
the modification of the control flow graph invalidates the derived data flow information,
which must then be updated before the next transformation can be selected and applied.
Using the notation described earlier, $F$ is the function that computes data flow information
from a control flow graph argument. Each optimizing program transformation represents
an incremental change in the control flow graph. Repetitive application of $F$ to slightly
different control flow graphs suggests that the incremental computation of $F$ could improve
the performance of the code optimization phase.


4  Approaches 


Much research into incremental computation has taken an ad hoc or algorithm-dependent
approach. In this approach, an existing algorithmic solution to a specific problem, $A$, is
modified to produce a new algorithm, $A'$, that computes a result in response to changes
in $x$. For example, [RM87] and [Zad84] both describe incremental approaches to the data
flow analysis problem described above. The advantage of restricting the domain of the
problem to specific algorithms is that each problem may offer its own unique opportunities
for incrementality. There is considerable research yet to be done on ad hoc incremental
algorithms, including finding good incremental methods for classical algorithms, developing
general paradigms for deriving good incremental algorithms, and formulating classifications
that characterize degrees to which algorithms can and cannot be made incremental.

The focus of our research, however, is the more general, algorithm-independent
approach to incrementality. In this approach, the problem of incrementality is orthogonal
to the design of an algorithm $A$ for a specified problem. Ideally, $A$ can be designed and
programmed in a standard, non-incremental manner. These algorithms are often simpler
to design and implement than those that are explicitly incremental, producing programs
whose correctness is easier to verify. The translation from the non-incremental algorithm $A$
and implementation $F$ to the incremental implementation $F'$ is automated. The research
problems in this area include the creation of programming languages in which evaluation
of $F$ can be efficiently updated when the input $x$ changes, identification of abstract data
types of general utility for which efficient incremental updating algorithms can be devised,
and the classification of problem domains for which the algorithm-independent approach is
viable.

Approaches to incremental computation have incorporated two distinct principles: taking
advantage of the history of previous computations in computing new results, and
determining the effect on the output of a relatively small change to an input. These two
distinct approaches can be illustrated by the work of other researchers in the ONR-supported
computer science community:

• The persistent data structures of Driscoll, Sarnak, Sleator, and Tarjan [DSST89]
address a key issue in the use of histories for incremental computation: how the
multiple versions of a data structure that arise in the course of the computation can
be maintained in a space-efficient manner that still permits time-efficient access to
any version.

• The finite differencing of Paige [PK82], a generalization of optimization by strength
reduction, replaces expensive local calculations made inside loops with incremental
counterparts that make only small changes to large non-local data structures. An
earlier paper of Earley ([Ear76]), on which Paige's work is partially based, pointed
out that such optimization techniques could prove equally beneficial in implementing
incremental algorithms, as opposed to using them only to improve non-incremental
ones.
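
As a rough illustration of the strength-reduction idea that finite differencing generalizes,
the sketch below replaces a multiplication recomputed in every loop iteration with a value
that is maintained by small additive updates. The function names are hypothetical, and the
example is a textbook instance of strength reduction rather than a rendering of the
technique in [PK82], which operates on much richer set-valued expressions.

```python
def squares_direct(n):
    """Recompute i * i from scratch on every iteration."""
    return [i * i for i in range(n)]


def squares_differenced(n):
    """Maintain sq == i * i across iterations using only additions."""
    result = []
    sq = 0   # invariant at loop top: sq == i * i
    odd = 1  # invariant at loop top: odd == 2 * i + 1 == (i + 1)**2 - i**2
    for _ in range(n):
        result.append(sq)
        sq += odd  # (i + 1)^2 = i^2 + (2i + 1)
        odd += 2   # restore the invariant for the next i
    return result


# Both versions produce the same sequence of values.
assert squares_differenced(6) == squares_direct(6)
```

The expensive expression is never re-evaluated; instead, the change to its value caused by
the change to `i` is derived once and applied on each step.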

Our  contributions  to  incremental  computation  have  involved  both  the  use  of  histories 
and  the  propagation  of  small  changes. 


5  Progress 

In the Synthesizer, semantic analysis had been expressed imperatively, requiring every
semantic action to have a corresponding undo action. In our new approach, we identified
[DRT81] attribute grammars [Knu68] as a promising alternative. Their descriptive power
makes attribute grammars applicable to a wide variety of objects and their declarative
nature eliminates the need for explicit "undo" actions. In this framework, complex objects
are represented as consistently attributed derivation trees with respect to a given attribute
grammar. Whenever the object is modified, attribute values are updated to restore the
consistent state defined by the attribute equations. We developed the theoretical foundations
of this approach to building incremental systems in a series of papers that culminated in
Reps' Ph.D. Thesis, recipient of the 1983 ACM Doctoral Dissertation Award [Rep84]. The
contributions of this work were as follows:

•  It  proposed  the  attribute  grammar  model  of  incremental  computation  and  argued  its 
advantages. 

• It contained optimal evaluation algorithms, not just for arbitrary noncircular attribute
grammars, but for the absolutely noncircular and the ordered attribute grammar
subclasses as well.

• It presented two algorithms that carry out attribute evaluation while reducing the
number of intermediate attribute values retained. While others had worked on this
problem, these algorithms were the first to achieve sublinear worst-case behavior.

• It emphasized the importance of environment generation as opposed to ad hoc
construction techniques. In this, we were certainly not alone. Rather, this message
reinforced that of Emily, Mentor, and Gandalf.

• It demonstrated the possibility of applying formal techniques and rigorous analysis
to a fundamental software engineering issue and stimulated others to work on the
problem of incremental static-semantic analysis.
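
The idea of restoring consistency after a modification can be sketched as change
propagation over a dependency graph of attribute equations. Everything below is a
simplified illustration under stated assumptions: the class `AttributeGraph` and its
methods are hypothetical names, the attributes form a flat graph rather than a derivation
tree, and the update scans all equations in topological order, so it does not achieve the
optimal bounds of the algorithms described above. What it does show is the cutoff: an
attribute is re-evaluated only if one of its arguments actually changed value.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class AttributeGraph:
    def __init__(self):
        self.rules = {}   # derived attribute -> (function, argument names)
        self.values = {}  # attribute name -> current value

    def set_input(self, name, value):
        self.values[name] = value

    def define(self, name, func, args):
        self.rules[name] = (func, tuple(args))

    def _order(self):
        # TopologicalSorter expects a mapping node -> predecessors.
        graph = {name: args for name, (_, args) in self.rules.items()}
        order = TopologicalSorter(graph).static_order()
        return [n for n in order if n in self.rules]

    def evaluate_all(self):
        """Batch evaluation: compute every derived attribute."""
        for name in self._order():
            func, args = self.rules[name]
            self.values[name] = func(*(self.values[a] for a in args))

    def update(self, name, value):
        """Change an input; return the derived attributes re-evaluated."""
        reevaluated = []
        if self.values.get(name) == value:
            return reevaluated
        self.values[name] = value
        changed = {name}
        for attr in self._order():
            func, args = self.rules[attr]
            if changed.isdisjoint(args):
                continue  # no argument changed: keep the stored value
            reevaluated.append(attr)
            new = func(*(self.values[a] for a in args))
            if new != self.values[attr]:
                self.values[attr] = new
                changed.add(attr)  # propagate only on a real change
        return reevaluated


g = AttributeGraph()
g.set_input("a", 1)
g.set_input("b", 2)
g.define("s", lambda a, b: a + b, ["a", "b"])
g.define("p", lambda s: s % 2, ["s"])
g.define("q", lambda p: p * 10, ["p"])
g.evaluate_all()
touched = g.update("a", 3)  # s changes, p keeps its value, q is never visited
```

In this usage the change to `a` alters `s`, but `p` evaluates to the same value as before,
so propagation stops and `q` keeps its stored value.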

Our early work made several simplifying assumptions that are not valid in practice.
First, our notion of optimality charged for changed copy attributes, whose only function
is to communicate a value from one point in the tree to another. Second, our notion of
optimality charged for all uses of a changed aggregate-valued attribute, even when only a
single component of the aggregate changes. Third, we assumed that each semantic function
is a constant-time operation. These shortcomings were partially addressed in Hoover's
Ph.D. Thesis [Hoo87], whose main contributions were as follows:


• It introduced a data structure called the structure tree, which allows transitive
dependencies to be easily and efficiently represented in the attribute dependency graph
and maintained in the presence of changes to the dependency graph.

• It showed how structure trees can be used to improve incremental evaluation
performance when the dependency graph contains copy rule chains and aggregate-valued
attributes.

• It presented a new, heuristic incremental evaluation algorithm that appears to work
well in practice, although its running time is not guaranteed to be linear in the number
of attributes changed. This new algorithm was required by the introduction of
non-local structure tree edges, i.e. transitive dependency edges not properly embedded in
the derivation tree.


The attribute grammar approach to incremental computation is primarily history-based:
the collection of saved attributes is essentially a history of the intermediate values that
arise in the course of a computation. Incremental attribute evaluators work by restarting the
computation in the middle, at exactly the right place with respect to the object's mutation.
However, Hoover's solution to the problem of aggregate attributes introduces elements of
the alternate approach to incremental computation, the propagation of small changes, and
bears a resemblance to Paige's finite differencing technique.

Attribute grammars are only suitable for some problems. For instance, if attention is
restricted to noncircular attribute grammars and unit-time semantic functions, only
linear-time algorithms can be expressed. To circumvent this limitation, attribute grammar
systems typically allow the use of arbitrary recursive semantic functions, for which
incremental attribute evaluation schemes provide no incrementality. Motivated by this
observation, we have begun an investigation into the use of function caching, or memoizing,
to provide incremental computation within function evaluation [Pug88b] [PT89]. Function
caching may be used in a hybrid system in conjunction with incremental attribute evaluation,
or possibly, may emerge as a complete alternative to the attribute grammar approach to
incremental computation.
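
A minimal sketch of function caching, assuming trees are represented as nested Python
tuples and using the standard `functools.lru_cache` decorator as the cache. The function
`tree_sum` and the call counter are hypothetical and exist only to make the reuse visible;
this is not the mechanism of [Pug88b] or [PT89].

```python
from functools import lru_cache

calls = 0  # counts how many times the function body actually runs


@lru_cache(maxsize=None)
def tree_sum(tree):
    """Sum the leaves of a nested-tuple tree; results are cached per subtree."""
    global calls
    calls += 1
    if isinstance(tree, int):
        return tree
    left, right = tree
    return tree_sum(left) + tree_sum(right)


left = ((1, 2), (3, 4))
old = (left, (5, 6))
new = (left, (5, 7))  # small edit: one leaf of the right subtree changed

tree_sum(old)
before = calls
tree_sum(new)
after = calls
# Only the new root, the edited subtree (5, 7), and the new leaf 7 are
# evaluated; the unchanged left subtree and the leaf 5 are cache hits.
```

One caveat of this particular sketch: hashing a nested tuple itself visits the whole
subtree, so a practical system would need cheaper cache keys for large objects.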

The implementation of the Synthesizer Generator [RT88a] [RT88c], a tool for creating
Synthesizer-like language-based environments from formal specifications, has given us the
opportunity to both prototype these research developments and develop a platform for
future research work. The research uses of the Synthesizer Generator within our group
have included the following:


• The original T. Reps attribute evaluation algorithm [Rep82] [Rep83] has been part of
the Synthesizer Generator since its first release.

• The incremental attribute evaluator developed by T. Teitelbaum and T. Reps [RT88b]
is currently the most efficient evaluator in the released system for grammars that fall
into the class of ordered attribute grammars [Kas80].

• S. Horwitz used the Generator to study and prototype a specification formalism based
on a coupling of attribute grammars and relational databases [HT85] [Hor85].

• R. Hoover's work on incremental graph evaluation [Hoo87] [Hoo86] [HT86] is
implemented in the latest version of the Generator. This new attribute propagation
algorithm, in many cases, significantly reduces the size of the set of attributes that
must be reevaluated after a tree modification.

• W. Pugh's work on general models of incremental computation using lazy structure
sharing and memoizing [PT89] [Pug88a] [Pug88b] was prototyped in the Generator.

• S. Peckham is currently using the Generator to examine extensions of attribute
propagation algorithms that efficiently handle multiple, asynchronous modifications [Pec88].
This continues work done earlier in our research group [RMT86].

Among our users, we are aware of the following publications in which experience with
the Generator has been reported: [GP] [KKM87] [vE89] [NHWG88] [NL88] [FZ] [FZCL88]
[NBK88] [CS87] [Bm87] [Slo87] [BKJ88] [CP89] [BV87] [Gri87].


6  Research  Directions 


Storing input/output pairs for a function in a cache makes it possible to avoid repeated
recalculation of the function on exactly the same input. However, this technique only
partially addresses the problem of avoiding redundant calculations on composite objects that
may be only slightly altered. More problematic still is the handling of changes in functional
values, which occur quite naturally when specifying the semantics of programs or other
complex objects or systems in a denotational style. We believe that both of these problems
can be addressed using the lambda-calculus, a formalism in which both functions and
composite objects can be represented.

The problem of incremental re-evaluation of lambda-terms can be expressed as follows.
Given a lambda-term M which reduces to normal form N, alter M slightly to yield M' (this
may correspond to editing a functional value or some composite object). We then wish to
determine N', the normal form of M', using as much of the information already known from
the reduction of M to N as possible.

The  set  of  intermediate  lambda  terms  produced  in  the  reduction  of  M  to  N  provides  a 
history  of  N’s  computation.  Our  idea  is  to  examine  N’s  history  to  determine  exactly  what 
parts  of  the  computation  depend  on  the  changed  part  of  M,  and  what  parts  do  not.  Then,  in 
principle,  only  those  subcomputations  depending  on  the  changes  to  M  must  be  recomputed. 
Other  subcomputations  are  invariant  and  can  be  incorporated  into  the  new  result. 

An optimal incremental evaluation of M' is one that repeats no reduction already
performed in the evaluation of M. Our research has focussed on formal techniques for
analyzing reductions to determine dependencies on the initial term, determining exactly which
intermediate values need to be remembered to enable incremental re-evaluation and which
can be safely discarded, and practical means for performing reduction such that these values
can be computed and stored automatically.

The problem of incremental reduction of terms in the lambda-calculus has also led
us to the study of new schemes for evaluation of lambda terms. It turns out that all
existing lambda-calculus evaluation techniques are sub-optimal, in that they perform some
redundant or unnecessary calculations. In most practical settings (i.e., execution of
functional languages), the conventions used by programmers and commonly used compilation
techniques minimize the impact of such extra work by the evaluator. However, in a
setting where functions are edited and interpreted on the fly (as in practical implementations
of denotational semantics), the drawbacks of existing interpreters for the lambda calculus
become more apparent.


7  Grand  Challenge 


Discover  how  to  create  incremental  software  from  non-incremental  software  automatically. 


8 Research Transitions


The wide acceptance of the Synthesizer Generator as a research tool by the domestic and
international Computer Science community has been one of the most satisfying results of
our work. The fact that over 250 sites have licensed the Generator since its initial release
indicates its value to the computer science community. The licensees are approximately
one-half domestic and one-half overseas; half are within universities and colleges and half
in industrial and government settings. The growth in the number of sites licensing the
Generator has been essentially linear since the first release. Roughly half of the 150 sites
receiving Release 1 ordered Release 2. This suggests that while perhaps half of our over
250 sites are mainly curious, the remaining sites are making serious use of the system.

The implementation of the Synthesizer Generator has been a side-effect of our primary
work, fundamental research in incremental computation. Our implementation effort,
therefore, has largely focused on "proof of concept" rather than concentrating on the overall
applicability of the system. For example, although the Synthesizer Generator has been a
testbed for the design of asymptotically efficient algorithms (important because they scale
up), little effort has been devoted to the system's raw performance. Thus, while there is
little doubt of the essential technical feasibility of our basic research ideas, much engineering
remains to be done before their potential is fully realized. Such an effort is better
undertaken in a commercial rather than an academic setting. The mission of the newly founded
firm of GrammaTech, Inc. is to establish the Synthesizer Generator as a fully-engineered
commercial product.

Apart from the success of the Synthesizer Generator per se, we believe that our research
ideas have been well-received and have established something of a following for the attribute
grammar approach to building incremental systems. In addition to our software, Reps'
award-winning Ph.D. Thesis has been notably influential.


9  Recommendations  to  Funding  Agencies 

Consider  creating  an  initiative  in  incremental  computation. 
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O  \Jna{n)  log  n  m  log(nC)) 


lAl 
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0(nm  log  n) 
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O  {nm  log  n  log(nC)) 
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